Methods and means for improved targeted DNA insertion in plants.
Field of the invention 

The current invention relates to the field of molecular plant biology, more specifically to the
field of plant genome engineering. Methods are provided for the directed introduction of a
foreign DNA fragment at a preselected insertion site in the genome of a plant. Plants
containing the foreign DNA inserted at a particular site can now be obtained at a higher 
frequency and with greater accuracy than is possible with the currently available targeted 
DNA insertion methods. Moreover, in a large proportion of the resulting plants, the foreign 
DNA has only been inserted at the preselected insertion site, without the foreign DNA also 
having been inserted randomly at other locations in the plant's genome. The methods of the 
invention are thus an improvement, both quantitatively and qualitatively, over the prior art 
methods. Also provided are chimeric genes, plasmids, vectors and other means to be used in 
the methods of the invention. 

Background art 

The first generation of transgenic plants in the early 1980s by Agrobacterium-mediated
transformation technology has spurred the development of other methods to
introduce a foreign DNA of interest or a transgene into the genome of a plant, such as
PEG-mediated DNA uptake in protoplasts, microprojectile bombardment, silicon whisker-mediated
transformation etc.

All the plant transformation methods however have in common that the transgenes 
incorporated in the plant genome are integrated in a random fashion and in unpredictable 
copy number. Frequently, the transgenes can be integrated in the form of repeats, either of
the whole transgene or of parts thereof. Such a complex integration pattern may influence the
expression level of the transgenes, e.g. by destruction of the transcribed RNA through
posttranscriptional gene silencing mechanisms or by inducing methylation of the introduced
DNA, thereby downregulating the transcriptional activity on the transgene. Also, the
integration site per se can influence the level of expression of the transgene. The
combination of these factors results in a wide variation in the level of expression of the
transgenes or foreign DNA of interest among different transgenic plant cell and plant lines.
Moreover, the integration of the foreign DNA of interest may have a disruptive effect on the 
region of the genome where the integration occurs, and can influence or disturb the normal 
function of that target region, thereby leading to, often undesirable, side-effects. 

Therefore, whenever the effect of introduction of a particular foreign DNA into a plant is 
investigated, it is required that a large number of transgenic plant lines are generated and 
analysed in order to obtain significant results. Likewise, in the generation of transgenic crop 
plants where a particular DNA of interest is introduced in plants to provide the transgenic 
plant with a desired, known phenotype, a large population of independently created
transgenic plant lines or so-called events is created, to allow the selection of those plant lines
with optimal expression of the transgenes, and with minimal, or no, side-effects on the
overall phenotype of the transgenic plant. Particularly in this field, it would be advantageous 
if this trial-and-error process could be replaced by a more directed approach, in view of the 
burdensome regulatory requirements and high costs associated with the repeated field trials 
required for the elimination of the unwanted transgenic events. Furthermore, it will be clear 
that the possibility of targeted DNA insertion would also be beneficial in the process of so-



The need to control transgene integration in plants has been recognized early on, and several 
methods have been developed in an effort to meet this need (for a review see Kumar and 
Fladung, 2001, Trends in Plant Science, 6, pp155-159). These methods mostly rely on
homologous recombination-based transgene integration, a strategy which has been
successfully applied in prokaryotes and lower eukaryotes (see e.g. EP0317509 or the
corresponding publication by Paszkowski et al., 1988, EMBO J., 7, pp4021-4026). However,
for plants, the predominant mechanism for transgene integration is based on illegitimate
recombination which involves little homology between the recombining DNA strands. A
major challenge in this area is therefore the detection of the rare homologous recombination
events, which are masked by the far more efficient integration of the introduced foreign
DNA via illegitimate recombination.

One way of solving this problem is by selecting against the integration events that have
occurred by illegitimate recombination, such as exemplified in WO94/17176.

Another way of solving the problem is by activation of the target locus and/or repair or donor 
DNA through the induction of double stranded DNA breaks via rare-cutting endonucleases, 
such as I-SceI. This technique has been shown to increase the frequency of homologous
recombination by at least two orders of magnitude using Agrobacteria to deliver the repair
DNA to the plant cells (Puchta et al., 1996, Proc. Natl. Acad. Sci. U.S.A., 93, pp5055-5060;
Chilton and Que, Plant Physiol., 2003).

WO96/14408 describes an isolated DNA encoding the enzyme I-SceI. This DNA sequence
can be incorporated in cloning and expression vectors, transformed cell lines and transgenic 
animals. The vectors are useful in gene mapping and site-directed insertion of genes. 

WO00/46386 describes methods of modifying, repairing, attenuating and inactivating a gene
or other chromosomal DNA in a cell through an I-SceI double strand break. Also disclosed are
methods of treatment or prophylaxis of a genetic disease in an individual in need thereof.
Further disclosed are chimeric restriction endonucleases.

However, there still remains a need for improving the frequency of targeted insertion of a 
foreign DNA in the genome of a eukaryotic cell, particularly in the genome of a plant cell. 
These and other problems are solved as described hereinafter in the different detailed 
embodiments of the invention, as well as in the claims. 

Summary of the invention 

In one embodiment, the invention provides a method for introducing a foreign DNA of
interest, which may be flanked by a DNA region having at least 80% sequence identity to a
DNA region flanking a preselected site, into a preselected site, such as an I-SceI site, of a
genome of a plant cell, such as a maize cell, comprising the steps of

(a) inducing a double stranded DNA break at the preselected site in the genome of the
cell, e.g. by introducing an I-SceI encoding gene;

(b) introducing the foreign DNA of interest into the plant cell;

characterized in that the foreign DNA is delivered by direct DNA transfer, which may be
accomplished by bombardment of microprojectiles coated with the foreign DNA of interest.
The I-SceI encoding gene can comprise a nucleotide sequence encoding the amino acid
sequence of SEQ ID No 1, wherein said nucleotide sequence has a GC content of about 50%
to about 60%, provided that

i) the nucleotide sequence does not comprise a nucleotide sequence selected from the 
group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA, 
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, 
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, 
ATTAAA, AATTAA, AATACA and CATAAA; 

ii) the nucleotide sequence does not comprise a nucleotide sequence selected from the group
consisting of CCAAT, ATTGG, GCAAT and ATTGC;

iii) the nucleotide sequence does not comprise a sequence selected from the group
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG;

iv) the nucleotide sequence does not comprise a GC stretch consisting of 7 consecutive
nucleotides selected from the group of G or C;

v) the nucleotide sequence does not comprise an AT stretch consisting of 5 consecutive
nucleotides selected from the group of A or T; and

vi) the nucleotide sequence does not comprise the codons TTA, CTA, ATA, GTA, TCG,
CCG, ACG and GCG. An example of such an I-SceI encoding gene comprises the
nucleotide sequence of SEQ ID No 4.

The plant cell may be incubated in a plant phenolic compound prior to step a). 

In another embodiment, the invention relates to a method for introducing a foreign DNA of 
interest into a preselected site of a genome of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell ; 

characterized in that the double stranded DNA break is introduced by a rare cutting 
endonuclease encoded by a nucleotide sequence wherein said nucleotide sequence has a 
GC content of about 50% to about 60%, provided that 



FAX +32 9 2231923 BAYER BIOSCIENCE NV

i) the nucleotide sequence does not comprise a nucleotide sequence selected from 
the group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA, 
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, 
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, 
ATTAAA, AATTAA, AATACA and CATAAA;

ii) the nucleotide sequence does not comprise a nucleotide sequence selected from the group
consisting of CCAAT, ATTGG, GCAAT and ATTGC;

iii) the nucleotide sequence does not comprise a sequence selected from the group 
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG; 

iv) the nucleotide sequence does not comprise a GC stretch consisting of 7 
consecutive nucleotides selected from the group of G or C; 

v) the nucleotide sequence does not comprise an AT stretch consisting of 5
consecutive nucleotides selected from the group of A or T; and 

vi) the nucleotide sequence does not comprise the codons TTA, CTA, ATA, GTA, 
TCG, CCG, ACG and GCG. 

In yet another embodiment, the invention relates to a method for introducing a foreign DNA 
of interest into a preselected site of a genome of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell ; 

characterized in that prior to step (a), the plant cells are incubated in a plant phenolic
compound which may be selected from the group of acetosyringone (3,5-dimethoxy-4-
hydroxyacetophenone), α-hydroxy-acetosyringone, sinapinic acid (3,5-dimethoxy-4-
hydroxycinnamic acid), syringic acid (4-hydroxy-3,5-dimethoxybenzoic acid), ferulic acid
(4-hydroxy-3-methoxycinnamic acid), catechol (1,2-dihydroxybenzene), p-hydroxybenzoic
acid (4-hydroxybenzoic acid), β-resorcylic acid (2,4-dihydroxybenzoic acid), protocatechuic
acid (3,4-dihydroxybenzoic acid), pyrrogallic acid (2,3,4-trihydroxybenzoic acid), gallic
acid (3,4,5-trihydroxybenzoic acid) and vanillin (3-methoxy-4-hydroxybenzaldehyde).






The invention also provides an isolated DNA fragment comprising a nucleotide sequence 
encoding the amino acid sequence of SEQ ID No 1, wherein the nucleotide sequence has a 
GC content of about 50% to about 60%, provided that 

i) the nucleotide sequence does not comprise a nucleotide sequence selected from 
the group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA, 
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA,
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA,
ATTAAA, AATTAA, AATACA and CATAAA;

ii) the nucleotide sequence does not comprise a nucleotide sequence selected from the group
consisting of CCAAT, ATTGG, GCAAT and ATTGC;

iii) the nucleotide sequence does not comprise a sequence selected from the group 
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG; 

iv) the nucleotide sequence does not comprise a GC stretch consisting of 7 
consecutive nucleotides selected from the group of G or C; 

v) the nucleotide sequence does not comprise an AT stretch consisting of 5
consecutive nucleotides selected from the group of A or T; and

vi) codons of said nucleotide sequence coding for leucine (Leu), isoleucine (Ile),
valine (Val), serine (Ser), proline (Pro), threonine (Thr), alanine (Ala) do not
comprise TA or CG duplets in positions 2 and 3 of said codons.

The invention also provides an isolated DNA sequence comprising the nucleotide sequence
of SEQ ID No 4, as well as a chimeric gene comprising the isolated DNA fragment according
to the invention operably linked to a plant-expressible promoter and the use of such a
chimeric gene to insert a foreign DNA into an I-SceI recognition site in the genome of a
plant.

In yet another embodiment of the invention, a method is provided for introducing a foreign
DNA of interest into a preselected site of a genome of a plant cell comprising the steps of 

a) inducing a double stranded DNA break at the preselected site in the genome of the cell
by a rare cutting endonuclease;

b) introducing the foreign DNA of interest into the plant cell;

characterized in that said endonuclease comprises a nuclear localization signal.

Brief description of the figures 

Table 1 represents the possible trinucleotide (codon) choices for a synthetic I-SceI coding
region (see also the nucleotide sequence in SEQ ID No 2).

Table 2 represents preferred possible trinucleotide choices for a synthetic I-SceI coding
region (see also the nucleotide sequence in SEQ ID No 3).

Figure 1: Schematic representation of the target locus (A) and the repair DNA (B) used in the 
assay for homologous recombination mediated targeted DNA insertion. The target locus after 
recombination is also represented (C). DSB site: double stranded DNA break site; 
3'g7: transcription termination and polyadenylation signal of A. tumefaciens gene 7; neo:
plant expressible neomycin phosphotransferase; 35S: promoter of the CaMV 35S transcript;
5' bar: DNA region encoding the amino terminal portion of the phosphinothricin
acetyltransferase; 3'nos: transcription termination and polyadenylation signal of A.
tumefaciens nopaline synthetase gene; Pnos: promoter of the nopaline synthetase gene of A.
tumefaciens; 3'ocs: 3' transcription termination and polyadenylation signal of the octopine
synthetase gene of A. tumefaciens.

Detailed description 

The current invention is based on the following findings:

a) Introduction into the plant cells of the foreign DNA to be inserted via direct DNA
transfer, particularly microprojectile bombardment, unexpectedly increased the frequency
of targeted insertion events. All of the obtained insertion events were targeted DNA 
insertion events, which occurred at the site of the induced double stranded DNA break. 
Moreover all of these targeted insertion events appeared to be exact recombination events 
between the provided sequence homology flanking the double stranded DNA break. Only 
about half of these events had an additional insertion of the foreign DNA at a site 
different from the site of the induced double stranded DNA break. 

b) Induction of the double stranded DNA break by transient expression of a rare-cutting
double stranded break inducing endonuclease, such as I-SceI, encoded by a chimeric gene
comprising a synthetic coding region for a rare-cutting endonuclease such as I-SceI
designed according to a preselected set of rules surprisingly increased the quality of the
resulting targeted DNA insertion events (i.e. the frequency of perfectly targeted DNA 
insertion events). Furthermore, the endonuclease had been equipped with a nuclear 
localization signal. 

c) Preincubation of the target cells in a plant phenolic compound, such as acetosyringone, 
further increased the frequency of targeted insertion at double stranded DNA breaks 
induced in the genome of a plant cell. 

Any of the above findings, either alone or in combination, improves the frequency with 
which homologous recombination based targeted insertion events can be obtained, as well as 
the quality of the recovered events. 

Thus, in one aspect, the invention relates to a method for introducing a foreign DNA of 
interest into a preselected site of a genome of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell;

characterized in that the foreign DNA is delivered by direct DNA transfer.



As used herein "direct DNA transfer" is any method of DNA introduction into plant cells
which does not involve the use of natural Agrobacterium spp. capable of
introducing DNA into plant cells. This includes methods well known in the art such as
introduction of DNA by electroporation into protoplasts, introduction of DNA by
electroporation into intact plant cells or partially degraded tissues or plant cells, introduction
of DNA through the action of agents such as PEG and the like into protoplasts and
particularly bombardment with DNA coated microprojectiles. Introduction of DNA by direct
transfer into plant cells differs from Agrobacterium-mediated DNA introduction at least in
that double stranded DNA enters the plant cell, in that the entering DNA is not coated with
any protein, and in that the amount of DNA entering the plant cell may be considerably
greater. Furthermore, DNA introduced by direct transfer methods, such as the introduced
chimeric gene encoding a double stranded DNA break inducing endonuclease, may be more
amenable to transcription, resulting in a better timing of the induction of the double stranded
DNA break. Although not intending to limit the invention to a particular mode of action, it is
thought that the efficient homologous recombination-based insertion of repair DNA or foreign
DNA in the genome of a plant cell may be due to a combination of any of these parameters.

Conveniently, the double stranded DNA break may be induced at the preselected site by 
transient expression after introduction of a plant expressible gene encoding a rare cleaving 
double stranded break inducing enzyme. As set forth elsewhere in this document, I-SceI may
be used for that purpose to introduce a foreign DNA at an I-SceI recognition site. However,
it will be immediately clear to the person skilled in the art that other double stranded
break inducing enzymes can also be used to insert the foreign DNA at their respective
recognition sites. A list of rare cleaving DSB inducing enzymes and their respective
recognition sites is provided in Table I of WO 03/004659 (pages 17 to 20) (incorporated
herein by reference). Furthermore, methods are available to design custom-tailored
rare-cleaving endonucleases that recognize basically any target nucleotide sequence of
choice. Such methods have been described e.g. in WO 03/080809, WO94/18313 or WO95/09233
and in Isalan et al., 2001, Nature Biotechnology 19, 656-660 and Liu et al., 1997, Proc.
Natl. Acad. Sci. USA 94, 5525-5530.

Thus, as used herein "a preselected site" indicates a particular nucleotide sequence in the
plant nuclear genome at which location it is desired to insert the foreign DNA. A person
skilled in the art would be perfectly able to either choose a double stranded DNA break
inducing ("DSBI") enzyme recognizing the selected target nucleotide sequence or engineer
such a DSBI endonuclease. Alternatively, a DSBI endonuclease recognition site may be 
introduced into the plant genome using any conventional transformation method or by 
conventional breeding using a plant line having a DSBI endonuclease recognition site in its 
genome, and any desired foreign DNA may afterwards be introduced into that previously 
introduced preselected target site. 

The double stranded DNA break may be induced conveniently by transient introduction of a 
plant-expressible chimeric gene comprising a plant-expressible promoter region operably 
linked to a DNA region encoding a double stranded break inducing enzyme. The DNA 
region encoding a double stranded break inducing enzyme may be a synthetic DNA region, 
such as but not limited to, a synthetic DNA region whereby the codons are chosen according 
to the design scheme as described elsewhere in this application for I-SceI encoding regions.

The double stranded break inducing enzyme may comprise a nuclear localization
signal (NLS) [Raikhel, Plant Physiol. 100: 1627-1632 (1992) and references therein], such as
the NLS of SV40 large T-antigen [Kalderon et al., Cell 39: 499-509 (1984)]. The nuclear
localization signal may be located anywhere in the protein, but is conveniently located at the
N-terminal end of the protein. The nuclear localization signal may replace one or more of the
amino acids of the double stranded break inducing enzyme. 

As used herein "foreign DNA of interest" indicates any DNA fragment which one may want
to introduce at the preselected site. Although it is not strictly required, the foreign DNA of
interest may be flanked by at least one nucleotide sequence region having homology to a
DNA region flanking the preselected site. The foreign DNA of interest may be flanked at
both sides by DNA regions having homology to both DNA regions flanking the preselected
site. Thus the repair DNA molecule(s) introduced into the plant cell may comprise a foreign
DNA flanked by one or two flanking sequences having homology to the DNA regions
respectively upstream or downstream of the preselected site. This allows better control of the
insertion of the foreign DNA. Indeed, integration by homologous recombination will allow
precise joining of the foreign DNA fragment to the plant nuclear genome up to the
nucleotide level.

The flanking nucleotide sequences may vary in length, and should be at least about 10
nucleotides in length. However, the flanking region may be as long as is practically possible 
(e.g. up to about 100-150 kb such as complete bacterial artificial chromosomes (BACs)).
Preferably, the flanking region will be about 50 bp to about 2000 bp. Moreover the regions 
flanking the foreign DNA of interest need not be identical to the DNA regions flanking the 
preselected site and may have between about 80% to about 100% sequence identity, 
preferably about 95% to about 100% sequence identity with the DNA regions flanking the 
preselected site. The longer the flanking region, the less stringent the requirement for 
homology. Furthermore, it is preferred that the sequence identity is as high as practically 
possible in the vicinity of the location of exact insertion of the foreign DNA. 

Moreover, the regions flanking the foreign DNA of interest need not have homology to the 
regions immediately flanking the preselected site, but may have homology to a DNA region 
of the nuclear genome further remote from that preselected site. Insertion of the foreign
DNA will then result in a removal of the target DNA between the preselected insertion site
and the DNA region of homology. In other words, the target DNA located between the
homology regions will be replaced by the foreign DNA of interest.

For the purpose of this invention, the "sequence identity" of two related nucleotide or amino
acid sequences, expressed as a percentage, refers to the number of positions in the two
optimally aligned sequences which have identical residues (x100) divided by the number of
positions compared. A gap, i.e. a position in an alignment where a residue is present in one
sequence but not in the other, is regarded as a position with non-identical residues. The
alignment of the two sequences is performed by the Needleman and Wunsch algorithm
(Needleman and Wunsch 1970). Computer-assisted sequence alignment can be conveniently
performed using a standard software program such as GAP, which is part of the Wisconsin
Package Version 10.1 (Genetics Computer Group, Madison, Wisconsin, USA), using the
default scoring matrix with a gap creation penalty of 50 and a gap extension penalty of 3.
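
By way of illustration only, the percentage defined above can be computed from two rows of an existing pairwise alignment as in the following Python sketch. The function name percent_identity and the use of "-" as the gap symbol are assumptions made for this example; the alignment itself is presumed to have been produced beforehand by a Needleman and Wunsch implementation such as GAP, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment positions carrying identical residues.

    Both arguments are the two rows of one pairwise alignment, so they must
    have the same length. A position where either row holds a gap counts as
    a non-identical position, following the definition in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    # identical positions (x100) divided by the number of positions compared
    return 100.0 * identical / len(aligned_a)


# Hypothetical aligned flanking regions: 10 positions, one of them a gap.
print(percent_identity("ACGTACGTAC", "ACGTAC-TAC"))
```

In this sketch a flanking region would meet the "at least 80% sequence identity" criterion when the returned value is 80.0 or higher.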

In another aspect, the invention relates to a modified I-SceI encoding DNA fragment, and the
use thereof to efficiently introduce a foreign DNA of interest into a preselected site of a
genome of a plant cell, whereby the modified I-SceI encoding DNA fragment has a
nucleotide sequence which has been designed to fulfill the following criteria:

a) the nucleotide sequence encodes a functional I-SceI endonuclease, such as an
I-SceI endonuclease having the amino acid sequence as provided in SEQ ID No 1;

b) the nucleotide sequence has a GC content of about 50% to about 60%;

c) the nucleotide sequence does not comprise a nucleotide sequence selected from
the group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA,
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA,
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA,
ATTAAA, AATTAA, AATACA and CATAAA;

d) the nucleotide sequence does not comprise a nucleotide sequence selected from the
group consisting of CCAAT, ATTGG, GCAAT and ATTGC;

e) the nucleotide sequence does not comprise a sequence selected from the group
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG;

f) the nucleotide sequence does not comprise a GC stretch consisting of 7
consecutive nucleotides selected from the group of G or C;

g) the nucleotide sequence does not comprise an AT stretch consisting of 5
consecutive nucleotides selected from the group of A or T; and

h) the nucleotide sequence does not comprise codons coding for Leu, Ile, Val, Ser,
Pro, Thr, Ala that comprise TA or CG duplets in positions 2 and 3 (i.e. the
nucleotide sequence does not comprise the codons TTA, CTA, ATA, GTA, TCG,
CCG, ACG and GCG).
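
The criteria b) to h) above lend themselves to an automated check of a candidate coding sequence. The following Python sketch is one possible way of doing so and is not part of the original disclosure. The function name design_rule_violations is hypothetical, the bounds "about 50%" and "about 60%" are treated as hard limits of 50.0 and 60.0 purely for the example, and criterion a) (that the sequence encodes a functional endonuclease) is not tested.

```python
import re

FORBIDDEN_MOTIFS = (
    # criterion c)
    "GATAAT", "TATAAA", "AATATA", "AATATT", "GATAAA", "AATGAA", "AATAAG",
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA", "ATAAAA",
    "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA", "ATTAAA", "AATTAA",
    "AATACA", "CATAAA",
    # criterion d)
    "CCAAT", "ATTGG", "GCAAT", "ATTGC",
    # criterion e)
    "ATTTA", "AAGGT", "AGGTA", "GGTA", "GCAGG",
)
# criterion h)
FORBIDDEN_CODONS = {"TTA", "CTA", "ATA", "GTA", "TCG", "CCG", "ACG", "GCG"}


def design_rule_violations(cds, gc_min=50.0, gc_max=60.0):
    """Return a list of human-readable rule violations (empty if none).

    `cds` is a coding sequence read in frame from its first nucleotide.
    gc_min and gc_max are assumed hard limits standing in for "about 50%"
    and "about 60%".
    """
    seq = cds.upper()
    if not seq:
        raise ValueError("empty sequence")
    problems = []
    gc = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    if not gc_min <= gc <= gc_max:                       # criterion b)
        problems.append(f"GC content {gc:.1f}% outside {gc_min}-{gc_max}%")
    problems += [f"forbidden motif {m}" for m in FORBIDDEN_MOTIFS if m in seq]
    if re.search(r"[GC]{7}", seq):                       # criterion f)
        problems.append("GC stretch of 7 or more nucleotides")
    if re.search(r"[AT]{5}", seq):                       # criterion g)
        problems.append("AT stretch of 5 or more nucleotides")
    # Split into complete in-frame codons; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    problems += [
        f"forbidden codon {c} at codon {n}"
        for n, c in enumerate(codons, start=1)
        if c in FORBIDDEN_CODONS
    ]
    return problems


# Short hypothetical fragments, not taken from any SEQ ID of this document.
print(design_rule_violations("ATGGCTAAGCTCACTTCC"))
print(design_rule_violations("ATGTTAAATAAAGCG"))
```

Such a check reports problems but does not repair them; choosing alternative codons at the offending positions remains a separate design step.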

I-SceI is a site-specific endonuclease, responsible for intron mobility in mitochondria in
Saccharomyces cerevisiae. The enzyme is encoded by the optional intron Sc LSU-1 of the
21S rRNA gene and initiates a double stranded DNA break at the intron insertion site
generating a 4 bp staggered cut with 3'OH overhangs. The recognition site of I-SceI
endonuclease extends over an 18 bp non-symmetrical sequence (Colleaux et al. 1988 Proc.
Natl. Acad. Sci. USA 85: 6022-6026). The amino acid sequence for I-SceI and a universal
code equivalent of the mitochondrial I-SceI gene have been provided by e.g. WO 96/14408.



WO 96/14408 discloses that the following variants of I-SceI protein are still functional:

• positions 1 to 10 can be deleted

• position 36: Gly (G) is tolerated 

• position 40: Met (M) or Val (V) are tolerated 

• position 41 : Ser (S) or Asn (N) are tolerated 

• position 43: Ala (A) is tolerated 

• position 46: Val (V) or Asn (N) are tolerated

• position 91: Ala (A) is tolerated 

• positions 123 and 156: Leu (L) is tolerated 

• position 223 : Ala (A) and Ser (S) are tolerated 

and synthetic nucleotide sequences encoding such variant I-SceI enzymes can also be
designed and used in accordance with the current invention.

A nucleotide sequence encoding the amino acid sequence of I-SceI, wherein the
amino-terminally located 4 amino acids have been replaced by a nuclear localization signal
(SEQ ID No 1), thus consists of 244 trinucleotides which can be represented as R1 through
R244. For each of these positions, between 1 and 6 choices of trinucleotides encoding the
same amino acid are possible. Table 1 sets forth the possible choices for the trinucleotides
encoding the amino acid sequence of SEQ ID No 1 and provides the structural requirements
(either conditional or absolute) which make it possible to avoid inclusion in the synthetic
DNA sequence of the above mentioned "forbidden nucleotide sequences". Also provided is the
nucleotide sequence of the contiguous trinucleotides in IUPAC code.
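
The statement that each position offers between 1 and 6 trinucleotide choices follows from the degeneracy of the universal genetic code. The Python sketch below, given for illustration only, enumerates the codons available for an amino acid and then removes the eight codons excluded by the design criteria. The helper name codon_choices is hypothetical, and the output is not a reproduction of Table 1, which additionally contains position-specific conditions.

```python
from itertools import product

BASES = "TCAG"
# Universal genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...);
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codons excluded by the design criteria (TA or CG duplet in positions 2 and 3).
FORBIDDEN_CODONS = {"TTA", "CTA", "ATA", "GTA", "TCG", "CCG", "ACG", "GCG"}


def codon_choices(amino_acid, exclude=frozenset()):
    """Sorted codons encoding a one-letter amino acid, minus those in `exclude`."""
    return sorted(
        codon
        for codon, aa in CODON_TABLE.items()
        if aa == amino_acid and codon not in exclude
    )


# Met, Leu, Ile and Ser: all codons, then the codons left after exclusion.
for aa in "MLIS":
    print(aa, codon_choices(aa), codon_choices(aa, FORBIDDEN_CODONS))
```

Under these assumptions leucine, for example, keeps four of its six codons and isoleucine two of its three, while methionine and tryptophan are unaffected.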

As used herein, the symbols of the IUPAC code have their usual meaning, i.e. N = A or C or
G or T; R = A or G; Y = C or T; B = C or G or T (not A); V = A or C or G (not T); D = A or G
or T (not C); H = A or C or T (not G); K = G or T; M = A or C; S = G or C; W = A or T.
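
As an illustration of how these ambiguity symbols are read, the following Python sketch tests whether a concrete trinucleotide is covered by a degenerate one such as TTR or CTN. The dictionary encodes the usual meaning of the symbols; the function name matches_iupac is an assumption made for this example.

```python
# Usual meaning of the nucleotide ambiguity symbols.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_iupac(sequence, pattern):
    """True if every base of `sequence` is allowed by the symbol at the same
    position of `pattern`; sequences of different length never match."""
    if len(sequence) != len(pattern):
        return False
    return all(
        base in IUPAC[symbol]
        for base, symbol in zip(sequence.upper(), pattern.upper())
    )


print(matches_iupac("TTG", "TTR"))  # R allows A or G
print(matches_iupac("TTC", "TTR"))  # C is not allowed by R
print(matches_iupac("CTA", "CTN"))  # N allows any nucleotide
```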

Thus in one embodiment of the invention, an isolated synthetic DNA fragment is provided 
which comprises a nucleotide sequence as set forth in SEQ ID No 2, wherein the codons are 
chosen among the choices provided in such a way as to obtain a nucleotide sequence with an 
overall GC content of about 50% to about 60%, preferably about 54%-55% provided that the 
nucleotide sequence from position 28 to position 30 is not AAG; if the nucleotide sequence
from position 34 to position 36 is AAT then the nucleotide sequence from position 37 to
position 39 is not ATT or ATA; if the nucleotide sequence from position 34 to position 36 is
AAC then the nucleotide sequence from position 37 to position 39 is not ATT
simultaneously with the nucleotide sequence from position 40 to position 42 being AAA; if 
the nucleotide sequence from position 34 to position 36 is AAC then the nucleotide sequence 
from position 37 to position 39 is not ATA; if the nucleotide sequence from position 37 to 
position 39 is ATT or ATA then the nucleotide sequence from position 40 to 42 is not AAA; 
the nucleotide sequence from position 49 to position 51 is not CAA; the nucleotide sequence 
from position 52 to position 54 is not GTA; the codons from the nucleotide sequence from 
position 58 to position 63 are chosen according to the choices provided in such a way that 
the resulting nucleotide sequence does not comprise ATTTA; if the nucleotide sequence 
from position 67 to position 69 is CCC then the nucleotide sequence from position 70 to 
position 72 is not AAT; if the nucleotide sequence from position 76 to position 78 is AAA 
then the nucleotide sequence from position 79 to position 81 is not TTG simultaneously with 
the nucleotide sequence from position 82 to 84 being CTN; if the nucleotide sequence from 
position 79 to position 81 is TTA or CTA then the nucleotide sequence from position 82 to 
position 84 is not TTA; the nucleotide sequence from position 88 to position 90 is not GAA; 
if the nucleotide sequence from position 91 to position 93 is TAT, then the nucleotide 
sequence from position 94 to position 96 is not AAA; if the nucleotide sequence from 
position 97 to position 99 is TCC or TCG or AGC then the nucleotide
sequence from position 100 to 102 is not CCA simultaneously with the nucleotide sequence
from position 103 to 105 being TTR; if the nucleotide sequence from position 100 to 102 is
CAA then the nucleotide sequence from position 103 to 105 is not TTA; if the nucleotide 
sequence from position 109 to position 111 is GAA then the nucleotide sequence from 112 
to 114 is not TTA; if the nucleotide sequence from position 115 to 117 is AAT then the 
nucleotide sequence from position 118 to position 120 is not ATT or ATA; if the nucleotide
sequence from position 121 to 123 is GAG then the nucleotide sequence from position 124 
to position 126; the nucleotide sequence from position 133 to 135 is not GCA; the nucleotide 
sequence from position 139 to position 141 is not ATT; if the nucleotide sequence from
position 142 to position 144 is GGA then the nucleotide sequence from position 145 to
position 147 is not TTA; if the nucleotide sequence from position 145 to position 147 is TTA
then the nucleotide sequence from position 148 to position 150 is not ATA simultaneously
with the nucleotide sequence from position 151 to 153 being TTR; if the nucleotide sequence
from position 145 to position 147 is CTA then the nucleotide sequence from position 148 to 
position 150 is not ATA simultaneously with the nucleotide sequence from position 151 to 
153 being TTR; if the nucleotide sequence from position 148 to position 150 is ATA then the 
nucleotide sequence from position 151 to position 153 is not CTA or TTG; if the nucleotide 
sequence from position 160 to position 162 is GCA then the nucleotide sequence from 
position 163 to position 165 is not TAC; if the nucleotide sequence from position 163 to 
position 165 is TAT then the nucleotide sequence from position 166 to position 168 is not 
ATA simultaneously with the nucleotide sequence from position 169 to position 171 being 
AGR; the codons from the nucleotide sequence from position 172 to position 177 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise GCAGG; the codons from the nucleotide sequence from position 178 to 
position 186 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise AGGTA; if the nucleotide sequence from position 
193 to position 195 is TAT, then the nucleotide sequence from position 196 to position 198 
is not TGC; the nucleotide sequence from position 202 to position 204 is not CAA; the 
nucleotide sequence from position 217 to position 219 is not AAT; if the nucleotide 
sequence from position 220 to position 222 is AAA then the nucleotide sequence from 
position 223 to position 225 is not GCA; if the nucleotide sequence from position 223 to 
position 225 is GCA then the nucleotide sequence from position 226 to position 228 is not 
TAC; if the nucleotide sequence from position 253 to position 255 is GAC, then the 
nucleotide sequence from position 256 to position 258 is not CAA; if the nucleotide 
sequence from position 277 to position 279 is CAT, then the nucleotide sequence from 
position 280 to position 282 is not AAA; the codons from the nucleotide sequence from 
position 298 to position 303 are chosen according to the choices provided in such a way that 
the resulting nucleotide sequence does not comprise ATTTA; if the nucleotide sequence 
from position 304 to position 306 is GGC then the nucleotide sequence from position 307 to 
position 309 is not AAT; the codons from the nucleotide sequence from position 307 to 
position 312 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; the codons from the nucleotide sequence 
from position 334 to position 342 are chosen according to the choices provided in such a way 
that the resulting nucleotide sequence does not comprise ATTTA; if the nucleotide sequence 
from position 340 to position 342 is AAG then the nucleotide sequence from position 343 to 
345 is not CAT; if the nucleotide sequence from position 346 to position 348 is CAA then the 
nucleotide sequence from position 349 to position 351 is not GCA; the codons from the 
nucleotide sequence from position 349 to position 357 are chosen according to the choices 
provided in such a way that the resulting nucleotide sequence does not comprise ATTTA; 
the nucleotide sequence from position 355 to position 357 is not AAT; if the nucleotide 
sequence from position 358 to position 360 is AAA then the nucleotide sequence from 
position 361 to 363 is not TTG; if the nucleotide sequence from position 364 to position 366 
is GCC then the nucleotide sequence from position 367 to position 369 is not AAT; the 
codons from the nucleotide sequence from position 367 to position 378 are chosen according 
to the choices provided in such a way that the resulting nucleotide sequence does not 
comprise ATTTA; if the nucleotide sequence from position 382 to position 384 is AAT then 
the nucleotide sequence from position 385 to position 387 is not AAT; the nucleotide 
sequence from position 385 to position 387 is not AAT; if the nucleotide sequence from 
position 400 to 402 is CCC, then the nucleotide sequence from position 403 to 405 is not 
AAT; if the nucleotide sequence from position 403 to 405 is AAT, then the nucleotide 
sequence from position 406 to 408 is not AAT; the codons from the nucleotide sequence 
from position 406 to position 411 are chosen according to the choices provided in such a 
way that the resulting nucleotide sequence does not comprise ATTTA; the codons from the 
nucleotide sequence from position 421 to position 426 are chosen according to the choices 
provided in such a way that the resulting nucleotide sequence does not comprise ATTTA; 
the nucleotide sequence from position 430 to position 432 is not CCA; if the nucleotide 
sequence from position 436 to position 438 is TCA then the nucleotide sequence from 
position 439 to position 441 is not TTG; the nucleotide sequence from position 445 to 
position 447 is not TAT; the nucleotide sequence from position 481 to 483 is not AAT; 
if the nucleotide sequence from position 484 to position 486 is AAA, then the nucleotide 
sequence from position 487 to position 489 is not AAT simultaneously with the nucleotide 
sequence from position 490 to position 492 being AGY; if the nucleotide sequence from 
position 490 to position 492 is TCA, then the nucleotide sequence from position 493 to 
position 495 is not ACC simultaneously with the nucleotide sequence from position 496 to 
498 being AAY; if the nucleotide sequence from position 493 to position 495 is ACC, then 
the nucleotide sequence from position 496 to 498 is not AAT; the nucleotide sequence from 
position 496 to position 498 is not AAT; if the nucleotide sequence from position 499 to 
position 501 is AAA then the nucleotide sequence from position 502 to position 504 is not 
TCA or AGC; if the nucleotide sequence from position 508 to position 510 is GTA, then the 
nucleotide sequence from position 511 to 513 is not TTA; if the nucleotide sequence from 
position 514 to position 516 is AAT then the nucleotide sequence from position 517 to 
position 519 is not ACA; if the nucleotide sequence from position 517 to position 519 is 
ACC or ACG, then the nucleotide sequence from position 520 to position 522 is not CAA 
simultaneously with the nucleotide sequence from position 523 to position 525 being TCN; 
the codons from the nucleotide sequence from position 523 to position 531 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise ATTTA; if the nucleotide sequence from position 544 to position 546 is GAA 
then the nucleotide sequence from position 547 to position 549 is not TAT, simultaneously 
with the nucleotide sequence from position 550 to position 552 being TTR; the codons from 
the nucleotide sequence from position 547 to position 552 are chosen according to the 
choices provided in such a way that the resulting nucleotide sequence does not comprise 
ATTTA; if the nucleotide sequence from position 559 to position 561 is GGA then the 
nucleotide sequence from position 562 to position 564 is not TTG simultaneously with the 
nucleotide sequence from position 565 to 567 being CGN; if the nucleotide sequence from 
position 565 to position 567 is CGC then the nucleotide sequence from position 568 to 
position 570 is not AAT; the nucleotide sequence from position 568 to position 570 is not 
AAT; if the nucleotide sequence from position 574 to position 576 is TTC then the 
nucleotide sequence from position 577 to position 579 is not CAA simultaneously with the 
nucleotide sequence from position 580 to position 582 being TTR; if the nucleotide sequence 
from position 577 to position 579 is CAA then the nucleotide sequence from position 580 to 
position 582 is not TTA; if the nucleotide sequence from position 583 to position 585 is 
AAT then the nucleotide sequence from position 586 to 588 is not TGC; the nucleotide sequence 
from position 595 to position 597 is not AAA; if the nucleotide sequence from position 598 
to position 600 is ATT then the nucleotide sequence from position 601 to position 603 is not 
AAT; the nucleotide sequence from position 598 to position 600 is not ATA; the nucleotide 
sequence from position 601 to position 603 is not AAT; if the nucleotide sequence from 
position 604 to position 606 is AAA then the nucleotide sequence from position 607 to 
position 609 is not AAT; the nucleotide sequence from position 607 to position 609 is not 
AAT; the nucleotide sequence from position 613 to position 615 is not CCA; if the 
nucleotide sequence from position 613 to position 615 is CCG, then the nucleotide sequence 
from position 616 to position 618 is not ATA; if the nucleotide sequence from position 616 
to position 618 is ATA, then the nucleotide sequence from position 619 to 
621 is not ATA; if the nucleotide sequence from position 619 to position 621 is ATA, then 
the nucleotide sequence from position 622 to position 624 is not TAC; the nucleotide 
sequence from position 619 to position 621 is not ATT; the codons from the nucleotide 
sequence from position 640 to position 645 are chosen according to the choices provided in 
such a way that the resulting nucleotide sequence does not comprise ATTTA; if the 
nucleotide sequence from position 643 to position 645 is TTA then the nucleotide sequence 
from position 646 to position 648 is not ATA; if the nucleotide sequence from position 643 
to position 645 is CTA then the nucleotide sequence from position 646 to position 648 is not 
ATA; the codons from the nucleotide sequence from position 655 to position 660 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise ATTTA; if the nucleotide sequence from position 658 to 660 is TTA or CTA 
then the nucleotide sequence from position 661 to position 663 is not ATT or ATC; the 
nucleotide sequence from position 661 to position 663 is not ATA; if the nucleotide 
sequence from position 661 to position 663 is ATT then the nucleotide sequence from 
position 664 to position 666 is not AAA; the codons from the nucleotide sequence from 
position 670 to position 675 are chosen according to the choices provided in such a way that 
the resulting nucleotide sequence does not comprise ATTTA; if the nucleotide sequence 
from position 691 to position 693 is TAT then the nucleotide sequence from position 694 to 
position 696 is not AAA; if the nucleotide sequence from position 694 to position 696 is 
AAA then the nucleotide sequence from position 697 to position 699 is not TTG; if the 
nucleotide sequence from position 700 to position 702 is CCC then the nucleotide sequence 
from position 703 to position 705 is not AAT; if the nucleotide sequence from position 703 
to position 705 is AAT then the nucleotide sequence from position 706 to position 708 is not 
ACA or ACT; if the nucleotide sequence from position 706 to position 708 is ACA then the 
nucleotide sequence from position 709 to 711 is not ATA simultaneously with the nucleotide 
sequence from position 712 to position 714 being AGY; the nucleotide sequence does not 
comprise the codons TTA, CTA, ATA, GTA, TCG, CCG, ACG and GCG; said nucleotide 
sequence does not comprise a GC stretch consisting of 7 consecutive nucleotides selected 
from the group of G or C; and the nucleotide sequence does not comprise an AT stretch 
consisting of 5 consecutive nucleotides selected from the group of A or T. 
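
The sequence-wide exclusions that close the preceding list (the eight codons to be avoided, GC stretches of seven and AT stretches of five) can be checked mechanically. The following is a minimal sketch in Python, not part of the original disclosure; the function name `global_violations` is hypothetical, the reading frame is assumed to start at position 1, and stretches longer than the stated length are assumed to be excluded as well. Note that the motif ATTTA is itself an AT stretch of five, so it is caught by the last check.

```python
import re

# Codons that, according to the text, must not occur in the coding sequence.
FORBIDDEN_CODONS = {"TTA", "CTA", "ATA", "GTA", "TCG", "CCG", "ACG", "GCG"}


def global_violations(seq: str) -> list[str]:
    """Return a list of human-readable problems found in a coding sequence.

    Assumption: the sequence is read in frame starting at position 1.
    """
    seq = seq.upper()
    problems = []

    # Walk the sequence codon by codon (ignoring an incomplete trailing codon).
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in FORBIDDEN_CODONS:
            problems.append(f"forbidden codon {codon} at position {i + 1}")

    # Seven or more consecutive G/C nucleotides.
    if re.search(r"[GC]{7}", seq):
        problems.append("GC stretch of 7 or more nucleotides")

    # Five or more consecutive A/T nucleotides (this also covers ATTTA).
    if re.search(r"[AT]{5}", seq):
        problems.append("AT stretch of 5 or more nucleotides")

    return problems
```

For instance, `global_violations("ATGTTAGGC")` reports the TTA codon at position 4, whereas `global_violations("ATGGCCAAG")` returns an empty list.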

A preferred group of synthetic nucleotide sequences is set forth in Table 2 and corresponds 
to an isolated synthetic DNA fragment which comprises a nucleotide sequence as 
set forth in SEQ ID No 3, wherein the codons are chosen among the choices provided in such 
a way as to obtain a nucleotide sequence with an overall GC content of about 50% to about 
60%, preferably about 54%-55% provided that if the nucleotide sequence from position 121 
to position 123 is GAG then the nucleotide sequence from position 124 to 126 is not CAA; if 
the nucleotide sequence from position 253 to position 255 is GAC then the nucleotide 
sequence from position 256 to 258 is not CAA; if the nucleotide sequence from position 277 
to position 279 is CAT then the nucleotide sequence from position 280 to 282 is not AAA; if 
the nucleotide sequence from position 340 to position 342 is AAG then the nucleotide 
sequence from position 343 to position 345 is not CAT; if the nucleotide sequence from 
position 490 to position 492 is TCA then the nucleotide sequence from position 493 to 
position 495 is not ACC; if the nucleotide sequence from position 499 to position 501 is 
AAA then the nucleotide sequence from position 502 to 504 is not TCA or AGC; if the 
nucleotide sequence from position 517 to position 519 is ACC then the nucleotide sequence 
from position 520 to position 522 is not CAA simultaneously with the nucleotide sequence 
from position 523 to 525 being TCN; if the nucleotide sequence from position 661 to 
position 663 is ATT then the nucleotide sequence from position 664 to position 666 is not 
AAA; the codons from the nucleotide sequence from position 7 to position 15 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise a stretch of seven contiguous nucleotides from the group of G or C; the codons 
from the nucleotide sequence from position 61 to position 69 are chosen according to the 
choices provided in such a way that the resulting nucleotide sequence does not comprise a 
stretch of seven contiguous nucleotides from the group of G or C; the codons from the 
nucleotide sequence from position 130 to position 138 are chosen according to the choices 
provided in such a way that the resulting nucleotide sequence does not comprise a stretch of 
seven contiguous nucleotides from the group of G or C; the codons from the nucleotide 
sequence from position 268 to position 279 are chosen according to the choices provided in 
such a way that the resulting nucleotide sequence does not comprise a stretch of seven 
contiguous nucleotides from the group of G or C; the codons from the nucleotide sequence 
from position 322 to position 333 are chosen according to the choices provided in such a way 
that the resulting nucleotide sequence does not comprise a stretch of seven contiguous 
nucleotides from the group of G or C; the codons from the nucleotide sequence from position 
460 to position 468 are chosen according to the choices provided in such a way that the 
resulting nucleotide sequence does not comprise a stretch of seven contiguous nucleotides 
from the group of G or C; the codons from the nucleotide sequence from position 13 to 
position 27 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise a stretch of five contiguous nucleotides from the 
group of A or T; the codons from the nucleotide sequence from position 37 to position 48 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group of A or 
T; the codons from the nucleotide sequence from position 184 to position 192 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise a stretch of five contiguous nucleotides from the group of A or T; the codons 
from the nucleotide sequence from position 214 to position 219 are chosen according to the 
choices provided in such a way that the resulting nucleotide sequence does not comprise a 
stretch of five contiguous nucleotides from the group of A or T; the codons from the 
nucleotide sequence from position 277 to position 285 are chosen according to the choices 
provided in such a way that the resulting nucleotide sequence does not comprise a stretch of 
five contiguous nucleotides from the group of A or T; and the codons from the nucleotide 
sequence from position 388 to position 396 are chosen according to the choices provided in 
such a way that the resulting nucleotide sequence does not comprise a stretch of five 
contiguous nucleotides from the group of A or T; the codons from the nucleotide sequence 
from position 466 to position 474 are chosen according to the choices provided in such a way 
that the resulting nucleotide sequence does not comprise a stretch of five contiguous 
nucleotides from the group of A or T; the codons from the nucleotide sequence from position 
484 to position 489 are chosen according to the choices provided in such a way that the 
resulting nucleotide sequence does not comprise a stretch of five contiguous nucleotides 
from the group of A or T; the codons from the nucleotide sequence from position 571 to 
position 576 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise a stretch of five contiguous nucleotides from the 
group of A or T; the codons from the nucleotide sequence from position 598 to position 603 
are chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group of A or 
T; the codons from the nucleotide sequence from position 604 to position 609 are chosen 
according to the choices provided in such a way that the resulting nucleotide sequence does 
not comprise a stretch of five contiguous nucleotides from the group of A or T; the codons 
from the nucleotide sequence from position 613 to position 621 are chosen according to the 
choices provided in such a way that the resulting nucleotide sequence does not comprise a 
stretch of five contiguous nucleotides from the group of A or T; the codons from the 
nucleotide sequence from position 646 to position 651 are chosen according to the choices 
provided in such a way that the resulting nucleotide sequence does not comprise a stretch of 
five contiguous nucleotides from the group of A or T; the codons from the nucleotide 
sequence from position 661 to position 666 are chosen according to the choices provided in 
such a way that the resulting nucleotide sequence does not comprise a stretch of five 
contiguous nucleotides from the group of A or T; and the codons from the nucleotide 
sequence from position 706 to position 714 are chosen according to the choices provided in 
such a way that the resulting nucleotide sequence does not comprise a stretch of five 
contiguous nucleotides from the group of A or T. 
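
The overall GC content requirement stated at the start of this paragraph (about 50% to about 60%) is straightforward to verify for a candidate sequence. The sketch below is illustrative only; the function names `gc_content` and `in_gc_window` are hypothetical, and the bounds are taken as hard limits although the text says "about".

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def in_gc_window(seq: str, low: float = 0.50, high: float = 0.60) -> bool:
    """True if the GC fraction lies in the assumed window [low, high]."""
    return low <= gc_content(seq) <= high
```

A candidate chosen from the codon alternatives would be accepted only if `in_gc_window(candidate)` holds and none of the excluded combinations listed above occurs.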

The nucleotide sequence of SEQ ID No 4 is an example of such a synthetic nucleotide 
sequence encoding an I-SceI endonuclease which no longer contains any of the 
nucleotide sequences or codons to be avoided. However, it will be clear that a person skilled 
in the art can readily obtain a similar sequence encoding I-SceI by replacing one or more 
(between two and twenty) of the nucleotides to be chosen for any of the alternatives provided 
in the nucleotide sequence of SEQ ID No 3 (excluding any of the forbidden combinations 
described in the preceding paragraph) and use it to obtain a similar effect. 
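
When alternative codons are substituted in this way, the position-dependent exclusions ("if positions X to X+2 are ..., then positions Y to Y+2 are not ...") have to be re-checked. The following sketch shows one possible encoding of such rules as data. It is an illustration under assumptions: only three two-codon rules quoted from the list above are included, rules with a third "simultaneously" condition are left out, and the names `PAIR_RULES`, `codon_at` and `pair_rule_violations` are hypothetical.

```python
# Each rule: (trigger start, trigger codons, following start, banned codons).
# Positions are 1-based nucleotide positions, as used in the text.
PAIR_RULES = [
    (67, {"CCC"}, 70, {"AAT"}),            # 67-69 is CCC -> 70-72 is not AAT
    (79, {"TTA", "CTA"}, 82, {"TTA"}),     # 79-81 is TTA or CTA -> 82-84 is not TTA
    (115, {"AAT"}, 118, {"ATT", "ATA"}),   # 115-117 is AAT -> 118-120 is not ATT or ATA
]


def codon_at(seq: str, start: int) -> str:
    """Return the three nucleotides beginning at 1-based position `start`."""
    return seq[start - 1:start + 2].upper()


def pair_rule_violations(seq: str) -> list[tuple[int, int]]:
    """Return (trigger start, following start) for every rule that is broken."""
    hits = []
    for trigger_start, triggers, next_start, banned in PAIR_RULES:
        if codon_at(seq, trigger_start) in triggers and codon_at(seq, next_start) in banned:
            hits.append((trigger_start, next_start))
    return hits
```

An empty result means none of the encoded rules is violated; it says nothing about the rules that were not encoded.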

For expression in plant cells, the synthetic DNA fragments encoding I-SceI may be operably 
linked to a plant expressible promoter in order to obtain a plant expressible chimeric gene. 

A person skilled in the art will immediately recognize that for this aspect of the invention, it 
is not required that the repair DNA and/or the DSBI endonuclease encoding DNA are 
introduced into the plant cell by direct DNA transfer methods, but that the DNA may 
also be introduced into plant cells by Agrobacterium-mediated transformation methods as are 
available in the art. 

In yet another aspect, the invention relates to a method for introducing a foreign DNA of 
interest into a preselected site of a genome of a plant cell comprising the steps of 

(a) inducing a double stranded break at the preselected site in the genome of the cell; 

(b) introducing the foreign DNA of interest into the plant cell; 

characterized in that prior to step (a), the plant cells are incubated in a plant phenolic 
compound. 

"Plant phenolic compounds" or "plant phenolics" suitable for the invention are those 
substituted phenolic molecules which are capable of inducing a positive chemotactic response, 
particularly those which are capable of inducing increased vir gene expression in a Ti-plasmid 
containing Agrobacterium sp., particularly a Ti-plasmid containing Agrobacterium 
tumefaciens. Methods to measure chemotactic response towards plant phenolic compounds 
have been described by Ashby et al. (1988 J. Bacteriol. 170: 4181-4187) and methods to 
measure induction of vir gene expression are also well known (Stachel et al., 1985 Nature 
318: 624-629; Bolton et al. 1986 Science 232: 983-985). Preferred plant phenolic compounds 
are those found in wound exudates of plant cells. One of the best known plant phenolic 
compounds is acetosyringone, which is present in a number of wounded and intact cells of 
various plants, albeit in different concentrations. However, acetosyringone 
(3,5-dimethoxy-4-hydroxyacetophenone) is not the only plant phenolic which can induce the 
expression of vir genes. Other examples are α-hydroxy-acetosyringone, sinapinic acid 
(3,5-dimethoxy-4-hydroxycinnamic acid), syringic acid (4-hydroxy-3,5-dimethoxybenzoic acid), 
ferulic acid (4-hydroxy-3-methoxycinnamic acid), catechol (1,2-dihydroxybenzene), 
p-hydroxybenzoic acid (4-hydroxybenzoic acid), β-resorcylic acid (2,4-dihydroxybenzoic 
acid), protocatechuic acid (3,4-dihydroxybenzoic acid), pyrogallic acid 
(2,3,4-trihydroxybenzoic acid), gallic acid (3,4,5-trihydroxybenzoic acid) and vanillin 
(3-methoxy-4-hydroxybenzaldehyde). As used herein, the mentioned molecules are referred to as plant 
phenolic compounds. Plant phenolic compounds can be added to the plant culture medium 
either alone or in combination with other plant phenolic compounds. Although not intending 
to limit the invention to a particular mode of action, it is thought that the apparent 
stimulating effect of these plant phenolics on cell division (and thus also genome replication) 
may be enhancing targeted insertion of foreign DNA. 

Plant cells are preferably incubated in a plant phenolic compound for about one week, 
although it is expected that incubation for about one or two days in or on a plant phenolic 
compound will be sufficient. Plant cells should be incubated for a time sufficient to stimulate 
cell division. According to Guivarc'h et al. (1993, Protoplasma 174: 10-18) such an effect may 
already be obtained by incubation of plant cells for as little as 10 minutes. 

The above mentioned improved methods for homologous recombination based targeted 
DNA insertion may also be applied to improve the quality of the transgenic plant cells and 
plants obtained by direct DNA transfer methods, particularly by microprojectile 
bombardment. It is well known in the art that introduction of DNA by microprojectile 
bombardment frequently leads to complex integration patterns of the introduced DNA 
(integration of multiple copies of the foreign DNA of interest, either complete or partial, 
generation of repeat structures). Nevertheless, some plant genotypes or varieties may be 
more amenable to transformation using microprojectile bombardment than to transformation 
using e.g. Agrobacterium tumefaciens. It would thus be advantageous if the quality of the 
transgenic plant cells or plants obtained through microprojectile bombardment could be 
improved, i.e. if the pattern of integration of the foreign DNA could be influenced to be 
simpler. 

The above mentioned finding that introduction of foreign DNA through microprojectile 
bombardment in the presence of an induced double stranded DNA break in the nuclear 
genome, whereby the foreign DNA has homology to the sequences flanking the double 
stranded DNA break, frequently (about 50% of the obtained events) leads to simple 
integration patterns (single copy insertion in a predictable way and no insertion of additional 
fragments of the foreign DNA) provides the basis for a method of simplifying the complexity 
of insertion of foreign DNA in the nuclear genome of plant cells. 

Thus the invention also relates to a method of producing a transgenic plant by 
microprojectile bombardment comprising the steps of 

(a) inducing a double stranded DNA break at a preselected site in the genome of a cell of a 
plant, in accordance with the methods described elsewhere in this document; and 

(b) introducing the foreign DNA of interest into the plant cell by microprojectile 
bombardment wherein said foreign DNA of interest is flanked by two DNA regions 
having at least 80% sequence identity to the DNA regions flanking the preselected 
site in the genome of the plant. 
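
Whether a pair of flanking regions meets the "at least 80% sequence identity" criterion of step (b) can be estimated computationally. The sketch below is illustrative only: the text does not specify how identity is to be calculated, so the example assumes that each repair-DNA flank has already been aligned to its genomic counterpart without gaps and has the same length. The names `percent_identity` and `flanks_qualify` are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def flanks_qualify(left_repair: str, left_genome: str,
                   right_repair: str, right_genome: str,
                   threshold: float = 80.0) -> bool:
    """True if both flanking regions reach the identity threshold."""
    return (percent_identity(left_repair, left_genome) >= threshold
            and percent_identity(right_repair, right_genome) >= threshold)
```

For real flanking regions, which may differ by insertions or deletions, a proper pairwise alignment would be needed before counting identical positions.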

A significant portion of the transgenic plant population thus obtained will have a simple 
integration pattern of the foreign DNA in the genome of the plant cells, more particularly a 
significant portion of the transgenic plants will only have a one copy insertion of the foreign 
DNA, exactly between the two DNA regions flanking the preselected site in the genome of 
the plant. This portion is higher than the portion of transgenic plants with simple 
integration patterns, when the plants are obtained by simple microprojectile bombardment 
without inducing a double stranded DNA break, and without providing the foreign DNA 
with homology to the genomic regions flanking the preselected site. 

In a convenient embodiment of the invention, the target plant cell comprises in its genome a 
marker gene, flanked by two recognition sites for a rare-cleaving double stranded DNA 
break inducing endonuclease, one on each side. This marker DNA may be introduced in the 
genome of the plant cell of interest using any method of transformation, or may have been 
introduced into the genome of a plant cell of another plant line or variety (such as a plant 
line or variety easily amenable to transformation) and introduced into the plant cell of interest 
by classical breeding techniques. Preferably, the population of transgenic plants or plant cells 
comprising a marker gene flanked by two recognition sites for a rare-cleaving double 
stranded break inducing endonuclease has been analysed for the expression pattern of the 
marker gene (such as high expression, temporally or spatially regulated expression) and the 
plant lines with the desired expression pattern identified. Production of a transgenic plant by 
microprojectile bombardment comprising the steps of 

(a) inducing a double stranded DNA break at a preselected site in the genome of a cell of 
a plant, in accordance with the methods described elsewhere in this document; and 

(b) introducing the foreign DNA of interest into the plant cell by microprojectile 
bombardment wherein said foreign DNA of interest is flanked by two DNA regions 
having at least 80% sequence identity to the DNA regions flanking the preselected 
site in the genome of the plant 

will lead to transgenic plant cells and plants wherein the marker gene has been replaced 
by the foreign DNA of interest. 

The marker gene may be any selectable or a screenable plant-expressible marker gene, which 
is preferably a conventional chimeric marker gene. The chimeric marker gene can comprise a 
marker DNA that is under the control of, and operatively linked at its 5' end to, a promoter, 
preferably a constitutive plant-expressible promoter, such as a CaMV 35S promoter, or a 
light inducible promoter such as the promoter of the gene encoding the small subunit of 
Rubisco; and operatively linked at its 3' end to suitable plant transcription termination and 
polyadenylation signals. The marker DNA preferably encodes an RNA, protein or 
polypeptide which, when expressed in the cells of a plant, allows such cells to be readily 
separated from those cells in which the marker DNA is not expressed. The choice of the 
marker DNA is not critical, and any suitable marker DNA can be selected in a well known 
manner. For example, a marker DNA can encode a protein that provides a distinguishable 
color to the transformed plant cell, such as the A1 gene (Meyer et al. (1987), Nature 330: 
677), can encode a fluorescent protein [Chalfie et al., Science 263: 802-805 (1994); Crameri 
et al., Nature Biotechnology 14: 315-319 (1996)], can encode a protein that provides 
herbicide resistance to the transformed plant cell, such as the bar gene, encoding PAT which 
provides resistance to phosphinothricin (EP 0242246), or can encode a protein that provides 
antibiotic resistance to the transformed cells, such as the aac(6') gene, encoding GAT which 
provides resistance to gentamycin (WO 94/01560). Such selectable marker gene generally 
encodes a protein that confers to the cell resistance to an antibiotic or other chemical 
compound that is normally toxic for the cells. In plants the selectable marker gene may thus 
also encode a protein that confers resistance to a herbicide, such as a herbicide comprising a 
glutamine synthetase inhibitor (e.g. phosphinothricin) as an active ingredient. An example of 
such genes are genes encoding phosphinothricin acetyl transferase such as the sfr or sfrv 
genes (EP 242236; EP 242246; De Block et al., 1987 EMBO J. 6: 2513-2518). 

The introduced repair DNA may further comprise a marker gene that allows to better 
discriminate between integration by homologous recombination at the preselected site and 
the integration elsewhere in the genome. Such marker genes are available in the art and 
include marker genes whereby the absence of the marker gene can be positively selected for 
under selective conditions (e.g. codA, cytosine deaminase from E. coli conferring 
sensitivity to 5-fluoro cytosine, Perera et al. 1993 Plant Mol. Biol. 23, 793; Stougaard (1993) 
Plant J.: 755). The repair DNA needs to comprise the marker gene in such a way that 
integration of the repair DNA into the nuclear genome in a random way results in the 
presence of the marker gene whereas the integration of the repair DNA by homologous 
recombination results in the absence of the marker gene. 

It will be immediately clear that the same results can also be obtained using only one 
preselected site at which to induce the double stranded break, which is located in or near a 
marker gene. The flanking regions of homology are then preferably chosen in such way as to 
either inactivate the marker gene, or delete the marker gene and substitute it with the foreign DNA 
to be inserted. 

It will be appreciated that the means and methods of the invention are particularly useful for 
corn, but may also be used in other plants with similar effects, particularly in cereal plants 
including wheat, oat, barley, rye, rice, turfgrass, sorghum, millet or sugarcane plants. The 
methods of the invention can also be applied to any plant including but not limited to cotton, 
tobacco, canola, oilseed rape, soybean, vegetables, potatoes, Lemna spp., Nicotiana spp., 
Arabidopsis, alfalfa, barley, bean, corn, cotton, flax, pea, rape, rice, rye, safflower, sorghum, 
soybean, sunflower, tobacco, wheat, asparagus, beet, broccoli, cabbage, carrot, cauliflower, 
celery, cucumber, eggplant, lettuce, onion, oilseed rape, pepper, potato, pumpkin, radish, 
spinach, squash, tomato, zucchini, almond, apple, apricot, banana, blackberry, blueberry, 
cacao, cherry, coconut, cranberry, date, grape, grapefruit, guava, kiwi, lemon, lime, mango, 
melon, nectarine, orange, papaya, passion fruit, peach, peanut, pear- pineapple, pistachio, 
plum, raspberry, strawberry, tangerine, walnut and watermelon. 

It is also an object of the invention to provide plant cells and plants comprising foreign DNA 
molecules inserted at preselected sites, according to the methods of the invention. Gametes, 
seeds, embryos, either zygotic or somatic, progeny or hybrids of plants comprising the
targeted DNA insertion events, which are produced by traditional breeding methods are also 
included within the scope of the present invention. 

The plants obtained by the methods described herein may be further crossed by traditional 
breeding techniques with other plants to obtain progeny plants comprising the targeted DNA 
insertion events obtained according to the present invention. 

The following non-limiting Examples describe the design of a modified I-SceI encoding
chimeric gene, and the use thereof to insert foreign DNA into a preselected site of the plant
genome.

Unless stated otherwise in the Examples, all recombinant DNA techniques are carried out
according to standard protocols as described in Sambrook et al. (1989) Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY and in
Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current
Protocols, USA. Standard materials and methods for plant molecular work are described in
Plant Molecular Biology Labfax (1993) by R.D.D. Croy, jointly published by BIOS
Scientific Publications Ltd (UK) and Blackwell Scientific Publications, UK. Other references
for standard molecular biology techniques include Sambrook and Russell (2001) Molecular
Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY, and
Volumes I and II of Brown (1998) Molecular Biology LabFax, Second Edition, Academic
Press (UK). Standard materials and methods for polymerase chain reactions can be found in
Dieffenbach and Dveksler (1995) PCR Primer: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, and in McPherson et al. (2000) PCR - Basics: From Background to Bench,
First Edition, Springer Verlag, Germany.

Throughout the description and Examples, reference is made to the following sequences: 

SEQ ID No 1: amino acid sequence of a chimeric I-SceI comprising a nuclear localization
signal linked to an I-SceI protein lacking the 4 amino-terminal amino acids.

SEQ ID No 2: nucleotide sequence of I-SceI coding region (IUPAC code).

SEQ ID No 3: nucleotide sequence of synthetic I-SceI coding region (IUPAC code).

SEQ ID No 4: nucleotide sequence of synthetic I-SceI coding region.

SEQ ID No 5: nucleotide sequence of the T-DNA of pTTAM78 (target locus).

SEQ ID No 6: nucleotide sequence of the T-DNA of pTTA82 (repair DNA).

SEQ ID No 7: nucleotide sequence of pCV78.
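
The degenerate entries in the IUPAC-coded sequences (SEQ ID No 2 and 3) can be read as sets of concrete codons. The following minimal Python sketch, which is not part of the original disclosure, expands a degenerate trinucleotide into the codons it stands for. The function names `expand_codon` and `expand_alternatives` are hypothetical; the mapping is the standard IUPAC nucleotide ambiguity code.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_codon(degenerate):
    """Return all concrete trinucleotides matched by one IUPAC-coded codon."""
    if len(degenerate) != 3:
        raise ValueError("a codon must have exactly three positions")
    choices = [IUPAC[base] for base in degenerate.upper()]
    return sorted("".join(bases) for bases in product(*choices))


def expand_alternatives(*degenerate_codons):
    """Expand an entry written as alternatives, e.g. 'TTR or CTN'."""
    result = set()
    for codon in degenerate_codons:
        result.update(expand_codon(codon))
    return sorted(result)


if __name__ == "__main__":
    print(expand_codon("GAR"))                # codons for glutamic acid
    print(expand_alternatives("TTR", "CTN"))  # codons for leucine
```

An entry such as "TTR or CTN" is passed as two separate arguments, since a single three-letter IUPAC code cannot describe all six leucine codons.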
[Codon tables: the contents of these tables did not survive text extraction. As far as the residue shows, they list, for each trinucleotide position (R1, R2, ...) of the I-SceI coding region, the encoded amino acid (AA), the possible trinucleotides, the corresponding IUPAC code and any proviso (for example "IF R93 CAT NOT R94 AAA" or "AVOID ATTTA"); and, for the synthetic coding region, the codon choices, the IUPAC code, the provisos and the codon used in the exemplified I-SceI (SEQ ID No 4).]

Examples

Example 1: Design, synthesis and analysis of a plant expressible chimeric gene
encoding I-SceI.

The coding region of I-SceI wherein the 4 amino-terminal amino acids have been replaced by
a nuclear localization signal was optimized using the following process:

1. Change the codons to the most preferred codon usage for maize without altering the
amino acid sequence of I-SceI protein, using the Synergy Geneoptimizer™;

2. Adjust the sequence to create or eliminate specific restriction sites to exchange the
synthetic I-SceI coding region with the universal code I-SceI gene;

3. Eliminate all GC stretches longer than 6 bp and AT stretches longer than 4 bp to
avoid formation of secondary RNA structures that can affect pre-mRNA splicing;

4. Avoid CG and TA duplets in codon positions 2 and 3;

5. Avoid other regulatory elements such as possible premature polyadenylation signals
(GATAAT, TATAAA, AATATA, AATATT, GATAAA, AATGAA, AATAAG,
AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA,
ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA,
AATACA and CATAAA), cryptic intron splice sites (AAGGTAAGT and
TGCAGG), ATTTA pentamers and CCAAT box sequences (CCAAT, ATTGG,
GCAAT and ATTGC);

6. Recheck whether the adapted coding region fulfills all of the above mentioned criteria.
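
The recheck in step 6 lends itself to automation. The following is a minimal Python sketch, not the procedure actually used by the applicant, that screens a candidate coding region against steps 3 to 5. The function name `check_coding_region` is hypothetical, the CCAAT box list follows the wording of the claims (CCAAT, ATTGG, GCAAT, ATTGC), and "duplets in codon positions 2 and 3" is interpreted here as codons ending in CG or TA.

```python
import re

# Motifs to avoid (step 5): premature polyadenylation signals,
# cryptic intron splice sites, ATTTA pentamers and CCAAT box sequences.
POLYA_SIGNALS = [
    "GATAAT", "TATAAA", "AATATA", "AATATT", "GATAAA", "AATGAA", "AATAAG",
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA", "ATAAAA",
    "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA", "ATTAAA", "AATTAA",
    "AATACA", "CATAAA",
]
SPLICE_SITES = ["AAGGTAAGT", "TGCAGG"]
OTHER_MOTIFS = ["ATTTA", "CCAAT", "ATTGG", "GCAAT", "ATTGC"]
FORBIDDEN_MOTIFS = POLYA_SIGNALS + SPLICE_SITES + OTHER_MOTIFS


def check_coding_region(seq):
    """Return a sorted list of (position, description) for every violation."""
    seq = seq.upper()
    problems = []
    # Step 5: forbidden motifs; the lookahead also finds overlapping hits.
    for motif in FORBIDDEN_MOTIFS:
        for match in re.finditer("(?=" + motif + ")", seq):
            problems.append((match.start(), "motif " + motif))
    # Step 3: GC stretches longer than 6 bp, AT stretches longer than 4 bp.
    for match in re.finditer(r"[GC]{7,}", seq):
        problems.append((match.start(), "GC stretch of %d bp" % len(match.group())))
    for match in re.finditer(r"[AT]{5,}", seq):
        problems.append((match.start(), "AT stretch of %d bp" % len(match.group())))
    # Step 4: CG or TA at codon positions 2 and 3 (reading frame starts at 0).
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon[1:] in ("CG", "TA"):
            problems.append((i, "codon %s ends in %s" % (codon, codon[1:])))
    return sorted(problems)


if __name__ == "__main__":
    for position, description in check_coding_region("ATGTTAAATAAA"):
        print(position, description)
```

Positions are zero-based offsets into the sequence. An empty list means that none of the screened criteria is violated; it does not address codon usage (step 1) or restriction sites (step 2).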

A possible example of such a nucleotide sequence is represented in SEQ ID No 4. A
synthetic DNA fragment having the nucleotide sequence of SEQ ID No 4 was synthesized
and operably linked to a CaMV35S promoter and a CaMV35S 3' termination and
polyadenylation signal (yielding plasmid pCV78; SEQ ID No 7).

The synthetic I-SceI coding region was also cloned into a bacterial expression vector (as a
fusion protein allowing protein enrichment on amylose beads). The capacity of semi-purified
I-SceI protein to cleave in vitro a plasmid containing an I-SceI recognition site was verified.

Example 2. Isolation of maize cell lines containing a promoterless bar gene preceded by
an I-SceI site.

In order to develop an assay for double stranded DNA break induced homology-mediated
recombination, maize cell suspensions were isolated that contained a promoterless bar gene
preceded by an I-SceI recognition site integrated in the nuclear genome in single copy. Upon
double stranded DNA break induction through delivery of an I-SceI endonuclease encoding
plant expressible chimeric gene, and co-delivery of repair DNA comprising a CaMV 35S
promoter operably linked to the 5' end of the bar gene, the 35S promoter may be inserted
through homology mediated targeted DNA insertion, resulting in a functional bar gene
allowing resistance to phosphinothricin (PPT). The assay is schematically represented in
Figure 1.

The target locus was constructed by operably linking through conventional cloning
techniques the following DNA regions:

a) a 3' end termination and polyadenylation signal from the nopaline synthase gene

b) a promoter-less bar encoding DNA region

c) a DNA region comprising an I-SceI recognition site

d) a 3' end termination and polyadenylation signal from A. tumefaciens gene 7 (3'g7)

e) a plant expressible neomycin resistance gene comprising a nopaline synthase promoter,
a neomycin phosphotransferase gene, and a 3' ocs signal.

This DNA region was inserted in a T-DNA vector between the T-DNA borders. The T-DNA
vector was designated pTTAM78 (for the nucleotide sequence of the T-DNA see SEQ ID No 5).

The T-DNA vector was used directly to transform protoplasts of corn according to the
methods described in EP 0 469 273, using a He89-derived corn cell suspension. The T-DNA
vector was also introduced into Agrobacterium tumefaciens C58C1Rif(pEHA101) and the
resulting Agrobacterium was used to transform a He89-derived cell line. A number of target
lines were identified that contained a single copy of the target locus construct pTTAM78,
such as T24 (obtained by protoplast transformation) and lines 14-1 and 1-20 (obtained by
Agrobacterium mediated transformation).

Cell suspensions were established from these target lines in N6M cell suspension medium,
and grown in the light on a shaker (120 rpm) at 25°C. Suspensions were subcultured every
week.

Example 3: Homology based targeted insertion. 

The repair DNA pTTA82 is a T-DNA vector containing between the T-DNA borders the
following operably linked DNA regions:

a) a DNA region encoding only the amino-terminal part of the bar gene

b) a DNA region comprising a partial I-SceI recognition site (13 nucleotides located at the
5' end of the recognition site)

c) a CaMV 35S promoter region

d) a DNA region comprising a partial I-SceI recognition site (9 nucleotides located at the 3'
end of the recognition site)

e) a 3' end termination and polyadenylation signal from A. tumefaciens gene 7 (3'g7)

f) a chimeric plant expressible neomycin resistance gene

g) a defective I-SceI endonuclease encoding gene under control of a CaMV 35S promoter

The nucleotide sequence of the T-DNA of pTTA82 is represented in SEQ ID No 6.

This repair DNA was co-delivered with pCV78 (see Example 1) by particle bombardment
into suspension derived cells which were plated on filter paper as a thin layer. The filter
paper was plated on Mahi1VII substrate.

The DNA was bombarded into the cells using a PDS-1000/He Biolistics device. Microcarrier
preparation and coating of DNA onto microcarriers was essentially as described by Sanford
et al. 1992. Particle bombardment parameters were: target distance of 9 cm; bombardment
pressure of 1350 psi; gap distance of ¼"; and macrocarrier flight distance of 11 cm.
Immediately after bombardment the tissue was transferred onto non-selective Mhi1VII
substrate. As a control for successful delivery of DNA by particle bombardment, the three
target lines were also bombarded with microcarriers coated with plasmid DNA comprising
a chimeric bar gene under the control of a CaMV35S promoter (pRVA52).






Four days after bombardment, the filters were transferred onto Mh1VII substrate
supplemented with 25 mg/L PPT or onto Ahx1.5VIIino1000 substrate supplemented with 50
mg/L PPT.

Fourteen days later, the filters were transferred onto fresh Mh1VII medium with 10 mg/L
PPT for the target lines T24 and 14-1 and Mh1VII substrate with 25 mg/L PPT for target
line 1-20.



Two weeks later, potential targeted insertion events were scored based on their resistance to 
PPT. These PPT resistant events were also positive in the Liberty Link Corn Leaf/Seed test 
(Strategic Diagnostics Inc.). 



Number of PPT resistant calli 38 days after bombardment:

Target line   pRVA52                                  pTTA82 + pCV78
              Total number of   Mean number of        Total number of   Mean number of
              PPT-resistant     PPT-resistant         PPT-resistant     PPT-resistant
              events            events/petridish      events            events/petridish
1-20          75                25                    115               7.6
14-1          37                12.3                  38                2.2
T24           40                13.3                  2                 0.13



The PPT resistant events were further subcultured on Mh1VII substrate containing 10 mg/L
PPT and callus material was used for molecular analysis. Twenty independent candidate
targeted sequence insertion (TSI) events were analyzed by Southern analysis using the 35S
promoter and the 3' end termination and polyadenylation signal from the nopaline synthase
gene as probes. Based on the size of the expected fragment, all events appeared to be perfect
targeted sequence insertion events. Moreover, further analysis of about half of the targeted
sequence insertion events did not show additional non-targeted integration of either the
repair DNA or the I-SceI encoding DNA.

Sequence analysis of DNA amplified from eight of the targeted insertion events 
demonstrated that these events were indeed perfect homologous recombination based TSI 
events. 

Based on these data, the ratio of homologous recombination based DNA insertion versus the
"normal" illegitimate recombination varies from about 30% for 1-20 to about 17% for 14-1
and to about 1% for T24.
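
These percentages appear to follow from the mean number of PPT-resistant events per petri dish reported above: the mean obtained with pTTA82 + pCV78 divided by the mean obtained with the pRVA52 control. A minimal Python sketch of that arithmetic, under that assumption (the function and variable names are hypothetical):

```python
# Mean number of PPT-resistant events per petri dish, taken from the table:
# target line -> (pRVA52 control, pTTA82 + pCV78)
MEAN_EVENTS = {
    "1-20": (25.0, 7.6),
    "14-1": (12.3, 2.2),
    "T24": (13.3, 0.13),
}


def targeted_ratio_percent(control_mean, targeted_mean):
    """Targeted insertion events as a percentage of control events."""
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * targeted_mean / control_mean


if __name__ == "__main__":
    for line, (control, targeted) in MEAN_EVENTS.items():
        ratio = targeted_ratio_percent(control, targeted)
        print("%s: %.1f%%" % (line, ratio))
```

The computed values (roughly 30%, 18% and 1%) agree with the rounded figures quoted in the text.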

When using vectors similar to the ones described in Puchta et al., 1996 (supra) delivered by
electroporation to tobacco protoplasts in the presence of I-SceI induced double stranded
DNA breaks, the ratio of homologous recombination based DNA insertion versus normal
insertion was about 15%. However, only one out of 33 characterized events was a
homology-mediated targeted sequence insertion event whereby the homologous
recombination was perfect at both sides of the double stranded break.

Using the vectors from Example 2, but with a "universal code I-SceI construct" comprising a
nuclear localization signal, the ratio of HR based DNA insertion versus normal insertion
varied between 0.032% and 16% for different target lines, using either electroporation or
Agrobacterium mediated DNA delivery. The relative frequency of perfect targeted insertion
events differed between the different target lines, and varied from 8 to 70% for
electroporation mediated DNA delivery and from 73 to 90% for Agrobacterium mediated
DNA delivery.

Example 4. Acetosyringone preincubation improves the frequency of recovery of
targeted insertion events.

One week before bombardment as described in Example 3, cell suspensions were either
diluted in N6M medium or in LSIDhy1.5 medium supplemented with 200 µM
acetosyringone. Otherwise, the method as described in Example 3 was employed. As can be
seen from the results summarized in the following table, preincubation of the cells to be
transformed with acetosyringone had a beneficial effect on the recovery of targeted PPT
resistant insertion events.



Target line   Preincubation with acetosyringone       No preincubation
              Total number of   Mean number of        Total number of   Mean number of
              PPT-resistant     PPT-resistant         PPT-resistant     PPT-resistant
              events            events/petridish      events            events/petridish
1-20          89                7.6                   26                3.7
14-1          32                3.6                   6                 0.75
T24           0                 0                     2                 0.3
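
The size of the acetosyringone effect can be read from the table as the mean number of events per petri dish with preincubation divided by the mean without preincubation. A minimal Python sketch of that comparison follows; the names are hypothetical and the fold-change measure is an illustrative choice, not one used in the original text.

```python
# Mean PPT-resistant events per petri dish, taken from the table:
# target line -> (with acetosyringone preincubation, without preincubation)
MEAN_EVENTS = {
    "1-20": (7.6, 3.7),
    "14-1": (3.6, 0.75),
    "T24": (0.0, 0.3),
}


def fold_change(with_preincubation, without_preincubation):
    """Ratio of mean events per dish with versus without preincubation."""
    if without_preincubation <= 0:
        raise ValueError("baseline mean must be positive")
    return with_preincubation / without_preincubation


if __name__ == "__main__":
    for line, (treated, untreated) in MEAN_EVENTS.items():
        print("%s: %.2f-fold" % (line, fold_change(treated, untreated)))
```

On these figures the recovery roughly doubles for line 1-20 and increases almost five-fold for line 14-1, whereas no events were recovered from T24 after preincubation.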

Example 5: Media composition

Mahi1VII: N6 medium (Chu et al. 1975) supplemented with 100mg/L casein hydrolysate, 6
mM L-proline, 0.5g/L 2-(N-morpholino)ethanesulfonic acid (MES), 0.2M mannitol, 0.2M
sorbitol, 2% sucrose, 1mg/L 2,4-dichlorophenoxy acetic acid (2,4-D), adjusted to pH5.8,
solidified with 2.5 g/L Gelrite®.

Mhi1VII: N6 medium (Chu et al. 1975) supplemented with 0.5g/L 2-(N-
morpholino)ethanesulfonic acid (MES), 0.2M mannitol, 2% sucrose, 1mg/L 2,4-
dichlorophenoxy acetic acid (2,4-D), adjusted to pH5.8, solidified with 2.5 g/L Gelrite®.

Mh1VII: identical to Mhi1VII substrate but without 0.2 M mannitol.

Ahx1.5VIIino1000: MS salts supplemented with 1000mg/L myo-inositol, 0.1mg/L
thiamine-HCl, 0.5mg/L nicotinic acid, 0.5mg/L pyridoxine-HCl, 0.5g/L MES, 30g/L
sucrose, 10g/L glucose, 1.5mg/L 2,4-D, adjusted to pH 5.8, solidified with 2.5 g/L Gelrite®.

LSIDhy1.5: MS salts supplemented with 0.5mg/L nicotinic acid, 0.5mg/L pyridoxine-HCl,
1mg/L thiamine-HCl, 100mg/L myo-inositol, 6mM L-proline, 0.5g/L MES, 20g/L sucrose,
10g/L glucose, 1.5mg/L 2,4-D, adjusted to pH 5.2.

N6M: macro elements: 2830mg/L KNO3; 433mg/L (NH4)2SO4; 166mg/L CaCl2·2H2O; 250
mg/L MgSO4·7H2O; 400mg/L KH2PO4; 37.3mg/L Na2EDTA; 27.3mg/L FeSO4·7H2O; MS
micro elements, 500mg/L Bactotrypton, 0.5g/L MES, 1mg/L thiamine-HCl, 0.5mg/L nicotinic
acid, 0.5mg/L pyridoxine-HCl, 2mg/L glycine, 100mg/L myo-inositol, 3% sucrose, 0.5mg/L
2,4-D, adjusted to pH5.8.

What is claimed is:

1. A method for introducing a foreign DNA of interest into a preselected site of a genome 
of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell ; 
characterized in that the foreign DNA is delivered by direct DNA transfer. 

2. The method of claim 1 wherein said direct DNA transfer is accomplished by
bombardment of microprojectiles coated with the foreign DNA of interest.

3. The method of claim 1 or 2, wherein said foreign DNA of interest is flanked by a DNA 
region having at least 80% sequence identity to a DNA region flanking the preselected 
site. 

4. The method of any one of claims 1 to 3, wherein said double stranded DNA break is
induced by introduction of an I-SceI encoding gene.

5. The method of claim 4 wherein said I-SceI encoding gene comprises a nucleotide
sequence encoding the amino acid sequence of SEQ ID No 1, wherein said nucleotide
sequence has a GC content of about 50% to about 60%, provided that

vii) said nucleotide sequence does not comprise a nucleotide sequence selected from the
group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA,
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA,
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA,
ATTAAA, AATTAA, AATACA and CATAAA;

viii) said nucleotide sequence does not comprise a nucleotide sequence selected from the group
consisting of CCAAT, ATTGG, GCAAT and ATTGC;

ix) said nucleotide sequence does not comprise a sequence selected from the group
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG;

x) said nucleotide sequence does not comprise a GC stretch consisting of 7 consecutive
nucleotides selected from the group of G or C;

xi) said nucleotide sequence does not comprise an AT stretch consisting of 5 consecutive
nucleotides selected from the group of A or T; and

xii) said nucleotide sequence does not comprise the codons TTA, CTA, ATA, GTA,
TCG, CCG, ACG and GCG.

6. The method of claim 5, wherein the I-SceI encoding gene comprises the nucleotide
sequence of SEQ ID No 4.

7. The method of any of the foregoing claims, whereby the plant cell is a maize cell. 

8. The method of claim 7, wherein the maize cell is comprised within a cell suspension. 

9. The method of any of the foregoing claims, whereby said plant cell is incubated in a 
plant phenolic compound prior to step a). 

10. The method of claim 9, wherein said plant phenolic compound is acetosyringone. 

11. A method for introducing a foreign DNA of interest into a preselected site of a genome 
of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell ;

characterized in that the double stranded DNA break is introduced by a rare cutting 
endonuclease encoded by a nucleotide sequence wherein said nucleotide sequence has a 
GC content of about 50% to about 60%, provided that 

i) said nucleotide sequence does not comprise a nucleotide sequence selected from
the group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA,
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA,
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA,
ATTAAA, AATTAA, AATACA and CATAAA;



ii) said nucleotide sequence does not comprise a nucleotide sequence selected from the group
consisting of CCAAT, ATTGG, GCAAT and ATTGC;

iii) said nucleotide sequence does not comprise a sequence selected from the group 
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG; 

iv) said nucleotide sequence does not comprise a GC stretch consisting of 7 
consecutive nucleotides selected from the group of G or C; 

v) said nucleotide sequence does not comprise an AT stretch consisting of 5
consecutive nucleotides selected from the group of A or T; and

vi) said nucleotide sequence does not comprise the codons TTA, CTA, ATA, GTA, 
TCG, CCG, ACG and GCG. 
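
For illustration only, the GC content limitation and the codon proviso (vi) can be checked mechanically. The following minimal Python sketch is not part of the claims; the function names are hypothetical, and the hard 50% and 60% bounds are an assumption, since "about" is not given a numerical tolerance in the text.

```python
# Codons excluded by proviso (vi): those ending in TA or CG.
FORBIDDEN_CODONS = {"TTA", "CTA", "ATA", "GTA", "TCG", "CCG", "ACG", "GCG"}


def gc_content_percent(seq):
    """Percentage of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def forbidden_codons_used(seq):
    """Return (position, codon) for each excluded codon in reading frame 0."""
    seq = seq.upper()
    return [
        (i, seq[i:i + 3])
        for i in range(0, len(seq) - 2, 3)
        if seq[i:i + 3] in FORBIDDEN_CODONS
    ]


def meets_gc_and_codon_provisos(seq, low=50.0, high=60.0):
    """True if GC content lies in [low, high] and no excluded codon is used."""
    in_range = low <= gc_content_percent(seq) <= high
    return in_range and not forbidden_codons_used(seq)


if __name__ == "__main__":
    example = "ATGGCCAAGCTG"  # hypothetical 4-codon fragment, not from SEQ ID No 4
    print(round(gc_content_percent(example), 1))
    print(meets_gc_and_codon_provisos(example))
```

The remaining provisos (i) to (v) concern motifs and nucleotide stretches rather than codons and would need a separate pattern search.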

12. The method of claim 11, wherein the nucleotide sequence comprises the nucleotide
sequence of SEQ ID No 4.

13. The method of claim 11 or 12, wherein the foreign DNA of interest is introduced into
said plant cell by direct DNA transfer.

14. The method of any one of claims 11 to 13, wherein said direct DNA transfer is 
accomplished by bombardment of microprojectiles coated with the foreign DNA of 
interest. 

15. The method of any one of claims 11 to 14, wherein said foreign DNA of interest is 
flanked by a DNA region having at least 80% sequence identity to a DNA region 
flanking the preselected site. 

16. The method of any one of claims 11 to 15, wherein said double stranded DNA break is
induced by introduction of an I-SceI encoding gene.

17. The method of any of the foregoing claims, whereby the plant cell is a maize cell.

18. The method of claim 17, wherein the maize cell is comprised within a cell suspension. 

19. The method of any of the foregoing claims, whereby said plant cell is incubated in a
plant phenolic compound prior to step a).

20. The method of claim 19, wherein said plant phenolic compound is acetosyringone.

21. A method for introducing a foreign DNA of interest into a preselected site of a genome 
of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell; 

(b) introducing the foreign DNA of interest into the plant cell; 

characterized in that prior to step (a), the plant cells are incubated in a plant phenolic 
compound. 

22. The method according to claim 21, wherein said plant phenolic compound is selected 
from the group of acetosyringone (3,5-dimethoxy-4-hydroxyacetophenone), α-hydroxy- 
acetosyringone, sinapinic acid (3,5-dimethoxy-4-hydroxycinnamic acid), syringic acid 
(4-hydroxy-3,5-dimethoxybenzoic acid), ferulic acid (4-hydroxy-3-methoxycinnamic 
acid), catechol (1,2-dihydroxybenzene), p-hydroxybenzoic acid (4-hydroxybenzoic acid), 
β-resorcylic acid (2,4-dihydroxybenzoic acid), protocatechuic acid (3,4- 
dihydroxybenzoic acid), pyrrogallic acid (2,3,4-trihydroxybenzoic acid), gallic acid 
(3,4,5-trihydroxybenzoic acid) and vanillin (3-methoxy-4-hydroxybenzaldehyde). 

23. A method for introducing a foreign DNA of interest into a preselected site of a genome 
of a plant cell comprising the steps of 

(a) inducing a double stranded DNA break at the preselected site in the genome of the 
cell by a rare cutting endonuclease; 

(b) introducing the foreign DNA of interest into the plant cell; 

characterized in that said endonuclease comprises a nuclear localization signal. 

24. An isolated DNA fragment comprising a nucleotide sequence encoding the amino acid 
sequence of SEQ ID No 1, wherein the nucleotide sequence has a GC content of about 
50% to about 60%, provided that 
i) said nucleotide sequence does not comprise a nucleotide sequence selected from 
the group consisting of GATAAT, TATAAA, AATATA, AATATT, GATAAA, 
AATGAA, AATAAG, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, 
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, 
ATTAAA, AATTAA, AATACA and CATAAA; 

ii) said nucleotide does not comprise a nucleotide sequence selected from the group 
consisting of CCAAT, ATTGG, GCAAT and ATTGC; 

iii) said nucleotide sequence does not comprise a sequence selected from the group 
consisting of ATTTA, AAGGT, AGGTA, GGTA or GCAGG; 

iv) said nucleotide sequence does not comprise a GC stretch consisting of 7 
consecutive nucleotides selected from the group of G or C; 

v) said nucleotide sequence does not comprise a AT stretch consisting of 5 
consecutive nucleotides selected from the group of A or T; and 

vi) codons of said nucleotide sequence coding for Leu, Ile, Val, Ser, Pro, Thr, Ala do 
not comprise TA or GC duplets in positions 2 and 3 of said codons. 
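
The constraints of claim 24 can be checked mechanically. The following is a minimal Python sketch, offered for illustration only, that screens a candidate coding sequence against items i) to vi). The function names, the hard bounds of 50% and 60% standing in for "about 50% to about 60%", the assumption that the reading frame starts at the first nucleotide, and the reading of item vi) as the explicit codon list TTA, CTA, ATA, GTA, TCG, CCG, ACG and GCG (as recited in item xcviii) of claim 25) are assumptions of this sketch.

```python
import re

# Motifs recited in items i), ii) and iii) of claim 24.
FORBIDDEN_MOTIFS = [
    "GATAAT", "TATAAA", "AATATA", "AATATT", "GATAAA", "AATGAA", "AATAAG",
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA", "ATAAAA",
    "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA", "ATTAAA", "AATTAA",
    "AATACA", "CATAAA",                       # item i)
    "CCAAT", "ATTGG", "GCAAT", "ATTGC",       # item ii)
    "ATTTA", "AAGGT", "AGGTA", "GGTA", "GCAGG",  # item iii)
]

# Assumption: item vi) is read as this explicit codon list.
FORBIDDEN_CODONS = {"TTA", "CTA", "ATA", "GTA", "TCG", "CCG", "ACG", "GCG"}


def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def check_sequence(seq):
    """Return a list of human-readable problems; an empty list means none found."""
    seq = seq.upper()
    problems = []

    gc = gc_content(seq)
    if not 0.50 <= gc <= 0.60:  # assumption: hard bounds for "about"
        problems.append(f"GC content {gc:.2f} outside 0.50 to 0.60")

    for motif in FORBIDDEN_MOTIFS:
        if motif in seq:
            problems.append(f"forbidden motif {motif}")

    if re.search(r"[GC]{7}", seq):  # item iv)
        problems.append("GC stretch of 7 or more nucleotides")
    if re.search(r"[AT]{5}", seq):  # item v)
        problems.append("AT stretch of 5 or more nucleotides")

    # Item vi): walk the sequence codon by codon, frame starting at position 1.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in FORBIDDEN_CODONS:
            problems.append(f"forbidden codon {codon} at position {i + 1}")

    return problems
```

Calling `check_sequence` on a candidate sequence returns an empty list when none of the screened conditions is met, so the function can serve as a filter when enumerating synonymous codon choices.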

25. An isolated DNA fragment comprising the nucleotide sequence of SEQ ID No 2, wherein 
the GC content of said nucleotide sequence is about 50 to about 60%, provided that 

i) said nucleotide sequence from position 28 to position 30 is not AAG; 

ii) if the nucleotide sequence from position 34 to position 36 is AAT then the 
nucleotide sequence from position 37 to position 39 is not ATT or ATA; 

iii) if the nucleotide sequence from position 34 to position 36 is AAC then the 
nucleotide sequence from position 37 to position 39 is not ATT simultaneously 
with the nucleotide sequence from position 40 to position 42 being AAA; 

iv) if the nucleotide sequence from position 34 to position 36 is AAC then the 
nucleotide sequence from position 37 to position 39 is not ATA; 

v) if the nucleotide sequence from position 37 to position 39 is ATT or ATA then 
the nucleotide sequence from position 40 to 42 is not AAA; 

vi) the nucleotide sequence from position 49 to position 51 is not CAA; 

vii) the nucleotide sequence from position 52 to position 54 is not GTA; 
viii) the codons from the nucleotide sequence from position 58 to position 63 are 
chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

ix) if the nucleotide sequence from position 67 to position 69 is CCC then the 
nucleotide sequence from position 70 to position 72 is not AAT; 

x) if the nucleotide sequence from position 76 to position 78 is AAA then the 
nucleotide sequence from position 79 to position 81 is not TTG simultaneously 
with the nucleotide sequence from position 82 to 84 being CTN; 

xi) if the nucleotide sequence from position 79 to position 81 is TTA or CTA then 
the nucleotide sequence from position 82 to position 84 is not TTA; 

xii) the nucleotide sequence from position 88 to position 90 is not GAA; 

xiii) if the nucleotide sequence from position 91 to position 93 is TAT, then the 
nucleotide sequence from position 94 to position 96 is not AAA; 

xiv) if the nucleotide sequence from position 97 to position 99 is 
TCC or TCG or AGC then the nucleotide sequence from position 100 to 102 is 
not CCA simultaneously with the nucleotide sequence from position 103 to 105 
being TTR; 

xv) if the nucleotide sequence from position 100 to 102 is CAA then the nucleotide 
sequence from position 103 to 105 is not TTA; 

xvi) if the nucleotide sequence from position 109 to position 111 is GAA then the 
nucleotide sequence from 112 to 114 is not TTA; 

xvii) if the nucleotide sequence from position 115 to 117 is AAT then the 
nucleotide sequence from position 118 to position 120 is not ATT or ATA; 

xviii) if the nucleotide sequence from position 121 to 123 is GAG then the 
nucleotide sequence from position 124 to position 126 is not CAA; 

xix) the nucleotide sequence from position 133 to 135 is not GCA; 

xx) the nucleotide sequence from position 139 to position 141 is not ATT; 

xxi) if the nucleotide sequence from position 142 to position 144 is GGA then the 
nucleotide sequence from position 145 to position 147 is not TTA; 

xxii) if the nucleotide sequence from position 145 to position 147 is TTA then the 
nucleotide sequence from position 148 to position 150 is not ATA simultaneously 
with the nucleotide sequence from position 151 to 153 being TTR; 
xxiii) if the nucleotide sequence from position 145 to position 147 is CTA then the 
nucleotide sequence from position 148 to position 150 is not ATA simultaneously 
with the nucleotide sequence from position 151 to 153 being TTR; 

xxiv) if the nucleotide sequence from position 148 to position 150 is ATA then the 
nucleotide sequence from position 151 to position 153 is not CTA or TTG; 

xxv) if the nucleotide sequence from position 160 to position 162 is GCA then the 
nucleotide sequence from position 163 to position 165 is not TAC; 

xxvi) if the nucleotide sequence from position 163 to position 165 is TAT then the 
nucleotide sequence from position 166 to position 168 is not ATA simultaneously 
with the nucleotide sequence from position 169 to position 171 being AGR; 

xxvii) the codons from the nucleotide sequence from position 172 to position 177 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise GCAGG; 

xxviii) the codons from the nucleotide sequence from position 178 to position 186 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise AGGTA; 

xxix) if the nucleotide sequence from position 193 to position 195 is TAT, then the 
nucleotide sequence from position 196 to position 198 is not TGC; 

xxx) the nucleotide sequence from position 202 to position 204 is not CAA; 

xxxi) the nucleotide sequence from position 217 to position 219 is not AAT; 

xxxii) if the nucleotide sequence from position 220 to position 222 is AAA then the 
nucleotide sequence from position 223 to position 225 is not GCA; 

xxxiii) if the nucleotide sequence from position 223 to position 225 is GCA then the 
nucleotide sequence from position 226 to position 228 is not TAC; 

xxxiv) if the nucleotide sequence from position 253 to position 255 is GAC, then the 
nucleotide sequence from position 256 to position 258 is not CAA; 

xxxv) if the nucleotide sequence from position 277 to position 279 is CAT, then the 
nucleotide sequence from position 280 to position 282 is not AAA; 

xxxvi) the codons from the nucleotide sequence from position 298 to position 303 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 
xxxvii) if the nucleotide sequence from position 304 to position 306 is GGC then the 
nucleotide sequence from position 307 to position 309 is not AAT; 

xxxviii) the codons from the nucleotide sequence from position 307 to position 
312 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

xxxix) the codons from the nucleotide sequence from position 334 to position 342 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

xl) if the nucleotide sequence from position 340 to position 342 is AAG then the 
nucleotide sequence from position 343 to 345 is not CAT; 

xli) if the nucleotide sequence from position 346 to position 348 is CAA then the 
nucleotide sequence from position 349 to position 351 is not GCA; 

xlii) the codons from the nucleotide sequence from position 349 to position 357 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

xliii) the nucleotide sequence from position 355 to position 357 is not AAT; 

xliv) if the nucleotide sequence from position 358 to position 360 is AAA then the 
nucleotide sequence from position 361 to 363 is not TTG; 

xlv) if the nucleotide sequence from position 364 to position 366 is GCC then the 
nucleotide sequence from position 367 to position 369 is not AAT; 

xlvi) the codons from the nucleotide sequence from position 367 to position 378 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

xlvii) if the nucleotide sequence from position 382 to position 384 is AAT then the 
nucleotide sequence from position 385 to position 387 is not AAT; 

xlviii) the nucleotide sequence from position 385 to position 387 is not AAT; 

xlix) if the nucleotide sequence from position 400 to 402 is CCC, then the 
nucleotide sequence from position 403 to 405 is not AAT; 

l) if the nucleotide sequence from position 403 to 405 is AAT, then the nucleotide 
sequence from position 406 to 408 is not AAT; 
li) the codons from the nucleotide sequence from position 406 to position 411 are 
chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

lii) the codons from the nucleotide sequence from position 421 to position 426 are 
chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

liii) the nucleotide sequence from position 430 to position 432 is not CCA; 

liv) if the nucleotide sequence from position 436 to position 438 is TCA then the 
nucleotide sequence from position 439 to position 441 is not TTG; 

lv) the nucleotide sequence from position 445 to position 447 is not TAT; 

lvi) the nucleotide sequence from position 481 to 483 is not AAT; 

lvii) if the nucleotide sequence from position 484 to position 486 is AAA, then the 
nucleotide sequence from position 487 to position 489 is not AAT simultaneously 
with the nucleotide sequence from position 490 to position 492 being AGY; 

lviii) if the nucleotide sequence from position 490 to position 492 is TCA, then the 
nucleotide sequence from position 493 to position 495 is not ACC simultaneously 
with the nucleotide sequence from position 496 to 498 being AAY; 

lix) if the nucleotide sequence from position 493 to position 495 is ACC, then the 
nucleotide sequence from position 496 to 498 is not AAT; 

lx) the nucleotide sequence from position 496 to position 498 is not AAT; 

lxi) if the nucleotide sequence from position 499 to position 501 is AAA then the 
nucleotide sequence from position 502 to position 504 is not TCA or AGC; 

lxii) if the nucleotide sequence from position 508 to position 510 is GTA, then the 
nucleotide sequence from position 511 to 513 is not TTA; 

lxiii) if the nucleotide sequence from position 514 to position 516 is AAT then the 
nucleotide sequence from position 517 to position 519 is not ACA; 

lxiv) if the nucleotide sequence from position 517 to position 519 is ACC or ACG, 
then the nucleotide sequence from position 520 to position 522 is not CAA 
simultaneously with the nucleotide sequence from position 523 to position 525 
being TCN; 
lxv) the codons from the nucleotide sequence from position 523 to position 531 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

lxvi) if the nucleotide sequence from position 544 to position 546 is GAA then the 
nucleotide sequence from position 547 to position 549 is not TAT 
simultaneously with the nucleotide sequence from position 550 to position 552 
being TTR; 

lxvii) the codons from the nucleotide sequence from position 547 to position 552 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

lxviii) if the nucleotide sequence from position 559 to position 561 is GGA then the 
nucleotide sequence from position 562 to position 564 is not TTG simultaneously 
with the nucleotide sequence from position 565 to 567 being CGN; 

lxix) if the nucleotide sequence from position 565 to position 567 is CGC then the 
nucleotide sequence from position 568 to position 570 is not AAT; 

lxx) the nucleotide sequence from position 568 to position 570 is not AAT; 

lxxi) if the nucleotide sequence from position 574 to position 576 is TTC then the 
nucleotide sequence from position 577 to position 579 is not CAA simultaneously 
with the nucleotide sequence from position 580 to position 582 being TTR; 

lxxii) if the nucleotide sequence from position 577 to position 579 is CAA then the 
nucleotide sequence from position 580 to position 582 is not TTA; 

lxxiii) if the nucleotide sequence from position 583 to position 585 is AAT then the 
nucleotide sequence from position 586 to 588 is not TGC; 

lxxiv) the nucleotide sequence from position 595 to position 597 is not AAA; 

lxxv) if the nucleotide sequence from position 598 to position 600 is ATT then the 
nucleotide sequence from position 601 to position 603 is not AAT; 

lxxvi) the nucleotide sequence from position 598 to position 600 is not ATA; 

lxxvii) the nucleotide sequence from position 601 to position 603 is not AAT; 

lxxviii) if the nucleotide sequence from position 604 to position 606 is AAA then the 
nucleotide sequence from position 607 to position 609 is not AAT; 

lxxix) the nucleotide sequence from position 607 to position 609 is not AAT; 

lxxx) the nucleotide sequence from position 613 to position 615 is not CCA; 
lxxxi) if the nucleotide sequence from position 613 to position 615 is CCG, then the 
nucleotide sequence from position 616 to position 618 is not ATA; 

lxxxii) if the nucleotide sequence from position 616 to the nucleotide at position 618 
is ATA, then the nucleotide sequence from position 619 to 621 is not ATA; 

lxxxiii) if the nucleotide sequence from position 619 to position 621 is ATA, then the 
nucleotide sequence from position 622 to position 624 is not TAC; 

lxxxiv) the nucleotide sequence from position 619 to position 621 is not ATT; 

lxxxv) the codons from the nucleotide sequence from position 640 to position 645 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

lxxxvi) if the nucleotide sequence from position 643 to position 645 is TTA then the 
nucleotide sequence from position 646 to position 648 is not ATA; 

lxxxvii) if the nucleotide sequence from position 643 to position 645 is CTA 
then the nucleotide sequence from position 646 to position 648 is not ATA; 

lxxxviii) the codons from the nucleotide sequence from position 655 to position 
660 are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

lxxxix) if the nucleotide sequence from position 658 to 660 is TTA or CTA then the 
nucleotide sequence from position 661 to position 663 is not ATT or ATC; 

xc) the nucleotide sequence from position 661 to position 663 is not ATA; 

xci) if the nucleotide sequence from position 661 to position 663 is ATT then the 
nucleotide sequence from position 664 to position 666 is not AAA; 

xcii) the codons from the nucleotide sequence from position 670 to position 675 
are chosen according to the choices provided in such a way that the resulting 
nucleotide sequence does not comprise ATTTA; 

xciii) if the nucleotide sequence from position 691 to position 693 is TAT then the 
nucleotide sequence from position 694 to position 696 is not AAA; 

xciv) if the nucleotide sequence from position 694 to position 696 is AAA then the 
nucleotide sequence from position 697 to position 699 is not TTG; 

xcv) if the nucleotide sequence from position 700 to position 702 is CCC then the 
nucleotide sequence from position 703 to position 705 is not AAT; 
xcvi) if the nucleotide sequence from position 703 to position 705 is AAT then the 
nucleotide sequence from position 706 to position 708 is not ACA or ACT; 

xcvii) if the nucleotide sequence from position 706 to position 708 is ACA then the 
nucleotide sequence from position 709 to 711 is not ATA simultaneously with the 
nucleotide sequence from position 712 to position 714 being AGY; 

xcviii) said nucleotide sequence does not comprise the codons TTA, CTA, ATA, 
GTA, TCG, CCG, ACG and GCG; 

xcix) said nucleotide sequence does not comprise a GC stretch consisting of 7 
consecutive nucleotides selected from the group of G or C; and 

c) said nucleotide sequence does not comprise an AT stretch consisting of 5 
consecutive nucleotides selected from the group of A or T. 
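
Most provisos of claim 25 have the same shape: a combination of triplets at fixed positions that must not occur together. As an illustration only, the following Python sketch encodes a handful of them (items i), ii), iii) and x)) as forbidden combinations and evaluates them against a sequence. The data layout, the function names and the selection of rules are assumptions of this sketch; the degenerate symbols R (A or G), Y (C or T) and N (any nucleotide) follow the usual IUPAC meaning.

```python
# IUPAC nucleotide codes used in the claims (subset).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any nucleotide
}


def matches(triplet, pattern):
    """True if the triplet is one of the sequences the IUPAC pattern stands for."""
    return len(triplet) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(triplet, pattern)
    )


# Each rule is a forbidden combination: the rule is violated only when every
# (start, alternatives) part matches. Start positions are 1-based, as in the claim.
RULES = {
    "i":   [(28, ["AAG"])],
    "ii":  [(34, ["AAT"]), (37, ["ATT", "ATA"])],
    "iii": [(34, ["AAC"]), (37, ["ATT"]), (40, ["AAA"])],
    "x":   [(76, ["AAA"]), (79, ["TTG"]), (82, ["CTN"])],
}


def is_violated(seq, rule):
    """Check one forbidden combination against the sequence."""
    seq = seq.upper()
    for start, alternatives in rule:
        triplet = seq[start - 1:start + 2]
        if not any(matches(triplet, alt) for alt in alternatives):
            return False  # one part does not match, so the combination is absent
    return True


def violated_rules(seq):
    """Names of all encoded provisos that the sequence violates."""
    return [name for name, rule in RULES.items() if is_violated(seq, rule)]
```

Expressing every proviso as a conjunction keeps the unconditional items (a single part) and the "if ... then ... simultaneously with ..." items (two or three parts) in one uniform representation, so further items can be added to `RULES` without changing the evaluation code.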

26. An isolated DNA fragment comprising the nucleotide sequence of SEQ ID No 3 wherein 
the GC content of said nucleotide sequence is about 50 to about 60%, provided that 

a) if the nucleotide sequence from position 121 to position 123 is GAG then the 
nucleotide sequence from position 124 to 126 is not CAA; 

b) if the nucleotide sequence from position 253 to position 255 is GAC then the 
nucleotide sequence from position 256 to 258 is not CAA; 

c) if the nucleotide sequence from position 277 to position 279 is CAT then the 
nucleotide sequence from position 280 to 282 is not AAA; 

d) if the nucleotide sequence from position 340 to position 342 is AAG then the 
nucleotide sequence from position 343 to position 345 is not CAT; 

e) if the nucleotide sequence from position 490 to position 492 is TCA then the 
nucleotide sequence from position 493 to position 495 is not ACC; 

f) if the nucleotide sequence from position 499 to position 501 is AAA then the 
nucleotide sequence from position 502 to 504 is not TCA or AGC; 

g) if the nucleotide sequence from position 517 to position 519 is ACC then the 
nucleotide sequence from position 520 to position 522 is not CAA simultaneously with 
the nucleotide sequence from position 523 to 525 being TCN; 

h) if the nucleotide sequence from position 661 to position 663 is ATT then the 
nucleotide sequence from position 664 to position 666 is not AAA; 
i) the codons from the nucleotide sequence from position 7 to position 15 are chosen 
according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

j) the codons from the nucleotide sequence from position 61 to position 69 are chosen 
according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

k) the codons from the nucleotide sequence from position 130 to position 138 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

l) the codons from the nucleotide sequence from position 268 to position 279 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

m) the codons from the nucleotide sequence from position 322 to position 333 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

n) the codons from the nucleotide sequence from position 460 to position 468 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of seven contiguous nucleotides from the group 
of G or C; 

o) the codons from the nucleotide sequence from position 13 to position 27 are chosen 
according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

p) the codons from the nucleotide sequence from position 37 to position 48 are chosen 
according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 
q) the codons from the nucleotide sequence from position 184 to position 192 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

r) the codons from the nucleotide sequence from position 214 to position 219 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

s) the codons from the nucleotide sequence from position 277 to position 285 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

t) the codons from the nucleotide sequence from position 388 to position 396 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

u) the codons from the nucleotide sequence from position 466 to position 474 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

v) the codons from the nucleotide sequence from position 484 to position 489 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

w) the codons from the nucleotide sequence from position 571 to position 576 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

x) the codons from the nucleotide sequence from position 598 to position 603 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 
y) the codons from the nucleotide sequence from position 604 to position 609 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

z) the codons from the nucleotide sequence from position 613 to position 621 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

aa) the codons from the nucleotide sequence from position 646 to position 651 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; 

bb) the codons from the nucleotide sequence from position 661 to position 666 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T; and 

cc) the codons from the nucleotide sequence from position 706 to position 714 are 
chosen according to the choices provided in such a way that the resulting nucleotide 
sequence does not comprise a stretch of five contiguous nucleotides from the group 
of A or T. 

27. An isolated DNA sequence according to claim 26, characterized in that it contains a 
nucleotide sequence differing from the nucleotide sequence of SEQ ID No 4 in only one 
position. 

28. An isolated DNA sequence according to claim 26, characterized in that it contains a 
nucleotide sequence differing from the nucleotide sequence of SEQ ID No 4 in only ten 
positions. 
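
Claims 27 and 28 turn on the number of positions at which a sequence differs from SEQ ID No 4. A minimal Python sketch of such a comparison is given below, purely as an illustration. It assumes that "differing in a position" means a nucleotide substitution between two sequences of equal length; the function name `differing_positions` is hypothetical.

```python
def differing_positions(seq_a, seq_b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        # Assumption: only substitutions are compared, so lengths must agree.
        raise ValueError("sequences must have the same length")
    return [
        index + 1
        for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]
```

The length of the returned list is the number of differing positions, which can then be compared with the one or ten positions named in the claims.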

29. An isolated DNA sequence comprising the nucleotide sequence of SEQ ID No 4. 
30. A chimeric gene comprising the isolated DNA fragment according to any one of claims 
24 to 29 operably linked to a plant-expressible promoter. 



31. Use of a chimeric gene according to claim 30 to insert a foreign DNA into an I-SceI 
recognition site in the genome of a plant. 
ABSTRACT 

Improved methods are provided for the directed introduction of a foreign DNA 
fragment at a preselected insertion site in the genome of a plant by introducing a 
double stranded DNA break at the preselected insertion site and providing repair 
DNA. 
SEQUENCE LISTING 
<110> Bayer Bioscience N.V. 

<120> Methods and means for improved targeted DNA insertion in plants 
<130> BCS 03 2007 
<160> 7 

<170> PatentIn version 3.1 



<210> 1 

<211> 244 

<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 1 



Met Ala Lys Pro Pro Lys Lys Lys Arg Lys Val Asn Ile Lys Lys Asn
1               5                   10                  15

Gln Val Met Asn Leu Gly Pro Asn Ser Lys Leu Leu Lys Glu Tyr Lys
            20                  25                  30

Ser Gln Leu Ile Glu Leu Asn Ile Glu Gln Phe Glu Ala Gly Ile Gly
        35                  40                  45

Leu Ile Leu Gly Asp Ala Tyr Ile Arg Ser Arg Asp Glu Gly Lys Thr
    50                  55                  60

Tyr Cys Met Gln Phe Glu Trp Lys Asn Lys Ala Tyr Met Asp His Val
65                  70                  75                  80

Cys Leu Leu Tyr Asp Gln Trp Val Leu Ser Pro Pro His Lys Lys Glu
                85                  90                  95

Arg Val Asn His Leu Gly Asn Leu Val Ile Thr Trp Gly Ala Gln Thr
            100                 105                 110

Phe Lys His Gln Ala Phe Asn Lys Leu Ala Asn Leu Phe Ile Val Asn
        115                 120                 125

Asn Lys Lys Thr Ile Pro Asn Asn Leu Val Glu Asn Tyr Leu Thr Pro
    130                 135                 140

Met Ser Leu Ala Tyr Trp Phe Met Asp Asp Gly Gly Lys Trp Asp Tyr
145                 150                 155                 160

Asn Lys Asn Ser Thr Asn Lys Ser Ile Val Leu Asn Thr Gln Ser Phe
                165                 170                 175

Thr Phe Glu Glu Val Glu Tyr Leu Val Lys Gly Leu Arg Asn Lys Phe
            180                 185                 190

Gln Leu Asn Cys Tyr Val Lys Ile Asn Lys Asn Lys Pro Ile Ile Tyr
        195                 200                 205

Ile Asp Ser Met Ser Tyr Leu Ile Phe Tyr Asn Leu Ile Lys Pro Tyr
    210                 215                 220

Leu Ile Pro Gln Met Met Tyr Lys Leu Pro Asn Thr Ile Ser Ser Glu
225                 230                 235                 240

Thr Phe Leu Lys




<210> 2 

<211> 732 

<212> DNA 

<213> Artificial sequence 
<223> synthetic DNA sequence encoding I-SceI (IUPAC code) 
<220> 

<221> misc_feature 

<222> (6)..(6) 

<223> N= A, G, C or T 

<220> 

<221> variation 

<222> (25)..(27) 

<223> AGR 

<220> 

<221> variation 

<222> (61)..(63) 

<223> TTR 

<220> 

<221> variation 

<222> (73)..(75) 

<223> AGY 

<220> 

<221> variation 

<222> (79)..(81) 

<223> TTR 

<220> 

<221> variation 
<222> (82).. (84) 

<223> TTR 

<220> 

<221> variation 

<222> (97).. (99) 

<223> AGY 

<220> 

<221> variation 

<222> (103)..(105) 

<223> TTR 

<220> 

<221> variation 

<222> (112)..(114) 

<223> TTR 

<220> 

<221> variation 

<222> (145) . . (147) 

<223> TTR 
<220> 

<221> variation 

<222> (151)..(153) 

<223> TTR 

<220> 

<221> variation 

<222> (169) . . (171) 

<223> AGR 

<220> 

<221> variation 

<222> (172)..(174) 

<223> AGY 

<220> 

<221> variation 

<222> (175)..(177) 

<223> AGR 

<220> 

<221> variation 

<222> (244)..(246) 

<223> TTR 

<220> 

<221> variation 

<222> (247) . . (249) 

<223> TTR 

<220> 

<221> variation 

<222> (265) . . (267) 

<223> TTR 

<220> 

<221> variation 

<222> (268) . . (270) 

<223> AGY 

<220> 

<221> variation 

<222> (289) . . (291) 

<223> AGR 

<220> 

<221> variation 
<222> (301) . . (303) 

<223> TTR 

<220> 

<221> variation 

<222> (310) . . (312) 

<223> TTR 



<220> 
<221> variation 
<222> (361)..(363) 
<223> TTR 

<220> 

<221> variation 

<222> (370) . . (372) 

<223> TTR 

<220> 

<221> variation 

<222> (409) . . (411) 

<223> TTR 

<220> 

<221> variation 

<222> (424) . . (426) 

<223> TTR 

<220> 

<221> variation 

<222> (436) . . (438) 

<223> AGY 

<220> 

<221> variation 

<222> (439) . . (441) 

<223> TTR 

<220> 

<221> variation 

<222> (490) . . (492) 

<223> AGY 

<220> 

<221> variation 

<222> (502) . . (504) 

<223> AGY 

<220> 

<221> variation 

<222> (511) . . (513) 

<223> TTR 

<220> 

<221> variation 

<222> (523) . . (525) 

<223> AGY 

<220> 

<221> variation 

<222> (550) . . (552) 

<223> TTR 

<220> 

<221> variation 

<222> (562) . . (564) 

<223> TTR 






<220> 

<221> variation 

<222> (565)..(567) 

<223> AGR 

<220> 

<221> variation 

<222> (580)..(582) 

<223> TTR 

<220> 

<221> variation 

<222> (631) . . (633) 

<223> AGY 

<220> 

<221> variation 

<222> (637)..(639) 

<223> AGY 

<220> 

<221> variation 

<222> (643) . . (645) 

<223> TTR 

<220> 

<221> variation 

<222> (658)..(660) 

<223> TTR 

<220> 

<221> variation 

<222> (673) . . (675) 

<223> TTR 

<220> 

<221> variation 

<222> (697) . . (699) 

<223> TTR 

<220> 

<221> variation 

<222> (712) . . (714) 

<223> AGY 

<220> 

<221> variation 

<222> (715)..(717) 

<223> AGY 

<220> 

<221> variation 

<222> (727) . . (729) 

<223> TTR 
<220> 

<221> misc_feature 

<222> (12)..(12) 

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (15).. (15) 

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (27).. (27)

<223> N= A, G, C or T

<220> 

<221> misc_feature 

<222> (33).. (33) 

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (54).. (54)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (63).. (63) 

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (66).. (66)

<223> N= A, G, C or T

<220> 

<221> misc_feature 

<222> (69).. (69) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (75).. (75)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (81).. (81)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (84).. (84)

<223> N= A, G, C or T

<220>

<221> misc_feature

<222> (99).. (99) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (105) . . (105)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (114) . . (114)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (135) . . (135)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (138) . . (138)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (144) . . (144)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (147) . . (147)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (153) . . (153)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (156) . . (156)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (162) . . (162) 

<223> N= A, G, C or T

<220>

<221> misc_feature

<222> (171) . . (171)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (174) . . (174)
<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (177) . . (177) 

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (186) . . (186)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (192) . . (192)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (225) . . (225) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (240) . . (240) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (246) . . (246)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (249) . . (249)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (264) . . (264)

<223> N= A, G, C or T

<220>

<221> misc_feature

<222> (267) . . (267)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (270) . . (270)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (273) . . (273) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (276) . . (276)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (291) . . (291)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (294) . . (294)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (303) . . (303)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (306) . . (306)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (312) . . (312)

<223> N= A, G, C or T 



<220> 
<221> misc_feature

<222> (315) . . (315)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (321) . . (321)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (327) . . (327)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (330) . . (330)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (336) . . (336) 

<223> N= A, G, C or T



<220> 

<221> misc_feature
<222> (351) . . (351) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (363) . . (363)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (366) . . (366)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (372) . . (372)

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (381) . . (381)

<223> N= A, G, C or T

<220>

<221> misc_feature 

<222> (396) . . (396)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (402) . . (402)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (411) . . (411)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (414) . . (414) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (426) . . (426)

<223> N= A, G, C or T



<220> 

<221> misc_feature
<222> (429) . . (429) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (432) . . (432)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (438) . . (438)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (441) . . (441)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (444) . . (444)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (465) . . (465)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (468) . . (468)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (492) . . (492) 

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (495) . . (495)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (504) . . (504)

<223> N= A, G, C or T 



<220> 

<221> misc_feature 

<222> (510) . . (510)

<223> N= A, G, C or T 



<220> 

<221> misc_feature 

<222> (513) . . (513)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (519) . . (519)

<223> N= A, G, C or T



<220> 

<221> misc_feature 

<222> (525) . . (525)

<223> N= A, G, C or T

<220>

<221> misc_feature 

<222> (531) . . (531) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (543) . . (543)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (552) . . (552) 

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (555) . . (555)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (561) . . (561)

<223> N= A, G, C or T 



<220> 

<221> misc_feature

<222> (564) . . (564)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (567) . . (567)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (582) . . (582)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (594) . . (594)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (615) . . (615)

<223> N= A, G, C or T

<220>

<221> misc_feature

<222> (633) . . (633)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (639) . . (639)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (645) . . (645)

<223> N= A, G, C or T

<220> 

<221> misc_feature 

<222> (660) . . (660) 

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (669) . . (669)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (675) . . (675)

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (681) . . (681)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (699) . . (699)

<223> N= A, G, C or T

<220> 

<221> misc_feature

<222> (702) . . (702)

<223> N= A, G, C or T 

<220> 

<221> misc_feature

<222> (708) . . (708)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (714) . . (714)

<223> N= A, G, C or T

<220>

<221> misc_feature

<222> (717) . . (717)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (723) . . (723)

<223> N= A, G, C or T



<220> 

<221> misc_feature

<222> (729) . . (729)

<223> N= A, G, C or T



<400> 2 

atggcnaarc cnccnaaraa raarcgnaar gtnaayatha araaraayca rgtnatgaay 60 

ctnggnccna aytcnaarct nctnaargar tayaartcnc arctnathga rctnaayath 120

garcarttyg argcnggnat hggnctnath ctnggngayg cntayathcg ntcncgngay 180

garggnaara cntaytgyat gcarttygar tggaaraaya argcntayat ggaycaygtn 240

tgyctnctnt aygaycartg ggtnctntcn ccnccncaya araargarcg ngtnaaycay 300

ctnggnaayc tngtnathac ntggggngcn caracnttya arcaycargc nttyaayaar 360

ctngcnaayc tnttyathgt naayaayaar aaracnathc cnaayaayct ngtngaraay 420

tayctnacnc cnatgtcnct ngcntaytgg ttyatggayg ayggnggnaa rtgggaytay 480

aayaaraayt cnacnaayaa rtcnathgtn ctnaayacnc artcnttyac nttygargar 540

gtngartayc tngtnaargg nctncgnaay aarttycarc tnaaytgyta ygtnaarath 600

aayaaraaya arccnathat htayathgay tcnatgtcnt ayctnathtt ytayaayctn 660

athaarccnt ayctnathcc ncaratgatg tayaarctnc cnaayacnat htcntcngar 720

acnttyctna ar 732
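
The degenerate notation of SEQ ID No 2 can be read with the standard IUPAC nucleotide ambiguity codes (for example r = a or g, y = c or t, h = a, c or t, n = any base). The following Python sketch is illustrative only and is not part of the sequence listing; the helper names `expand` and `matches` are hypothetical. It expands a degenerate codon into the concrete codons it covers and checks whether a given codon is among them.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, as in the listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}


def expand(degenerate):
    """Return every concrete sequence covered by a degenerate sequence."""
    choices = [IUPAC[base] for base in degenerate.lower()]
    return ["".join(combo) for combo in product(*choices)]


def matches(concrete, degenerate):
    """Return True if `concrete` is one of the sequences encoded by `degenerate`."""
    concrete = concrete.lower()
    degenerate = degenerate.lower()
    return len(concrete) == len(degenerate) and all(
        c in IUPAC[d] for c, d in zip(concrete, degenerate)
    )


print(expand("aar"))          # concrete codons covered by "aar"
print(matches("atc", "ath"))  # is "atc" covered by "ath"?
print(matches("atg", "ath"))  # is "atg" covered by "ath"?
```

Such a check could, for instance, be used to confirm that each codon of a concrete sequence such as SEQ ID No 4 falls within the corresponding degenerate codon, or within one of the alternatives listed as variations.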

<210> 3 
<211> 732 
<212> DNA 

<213> artificial sequence 



<220> 

<223> preferred synthetic DNA sequence encoding I-SceI (IUPAC code)

<220> 

<221> variation

<222> (25).. (27) 

<223> AGA 



<220> 

<221> variation 

<222> (73).. (75) 

<223> AGC 

<220> 

<221> variation

<222> (97) . . (99)

<223> AGC 

<220> 

<221> variation 

<222> (169) . . (171)

<223> AGA 



<220> 

<221> variation

<222> (172) . . (174) 

<223> AGC 



<220> 

<221> variation 

<222> (175) . . (177) 

<223> AGA 



<220> 

<221> variation 

<222> (268) . . (270)

<223> AGC 



<220> 

<221> variation 

<222> (289) . . (291) 

<223> AGA 

<220> 

<221> variation 

<222> (436) . . (438)

<223> AGC 



<220> 

<221> variation 

<222> (490) . . (492)

<223> AGC 

<220> 

<221> variation 

<222> (502) . . (504) 

<223> AGC 



<220> 

<221> variation 

<222> (523) . . (525) 

<223> AGC 



<220> 

<221> variation 
<222> (565) . . (567)
<223> AGA 



<220> 

<221> variation 

<222> (631) . . (633)

<223> AGC 



<220> 

<221> variation 

<222> (637) . . (639) 

<223> AGC 



<220> 

<221> variation 

<222> (712) . . (714)

<223> AGC 



<220> 

<221> variation 

<222> (715) . . (717)

<223> AGC 



<400> 3 

atggcyaarc chcchaaraa raarcgsaaa gtsaacatya araaraacca ggtsatgaac 60

ctsggmccha actcmaarct sctsaargag tacaartcmc arctsatyga rctsaacaty 120

garcarttcg argcyggmat cggmctsaty ctsggmgayg cytacatycg stcmcgsgay 180

garggmaara cytactgyat gcagttcgar tggaaraaca argcytacat ggaycaygts 240

tgyctsctst acgaycartg ggtsctstcm cchcchcaya araargarcg agtsaaccay 300

ctsggmaacc tsgtsatyac ytggggmgcy caracyttca arcaycargc yttcaacaar 360

ctsgcsaacc tsttcatyct saacaacaar aaracyatyc chaacaacct agtsgaraac 420

tacctsacyc cyatgtcmct sgcytactgg ttcatggayg ayggmggmaa rtgggaytac 480

aacaaraact cmacyaacaa rtcmatygts ctsaacacyc artcmttcac yttcgargar 540

gtsgartacc tsgtsaargg mctacgsaac aarttccarc tsaactgyta cgtsaagaty 600

aacaaraaca arccyatyat ctacatygay tcmatgtcmt acctsatytt ctacaaccts 660

atyaarccht acctsatycc hcaratgatg tacaarctsc chaacacyat ytcmtcmgar 720

acyttcctsa ar 732



<210> 4

<211> 732

<212> DNA

<213> artificial sequence



<220> 

<223> preferred synthetic DNA sequence encoding I-SceI (IUPAC code)
<400> 4 

atggccaagc ctcccaagaa gaagcgcaaa gtgaacatca agaagaacca ggtgatgaac 60 
ctgggaccta acagcaagct cctgaaggag tacaagagcc agctgatcga actgaacatc 120 
gagcagttcg aagctggcat cggcctgatc ctgggcgatg cctacatcag atcccgggac 180 
gaaggcaaga cctactgcat gcagttcgag tggaagaaca aggcctacat ggaccacgtg 240 
tgtctgctgt acgaccagtg ggtcctgagc cctcctcaca agaaggagcg cgtgaaccat 300

ctgggcaacc tcgtgatcac ctggggagcc cagaccttca agcaccaggc cttcaacaag 360 

ctggccaacc tgttcatcgt gaacaacaag aagaccatcc ccaacaacct cgtggagaac 420 

tacctcactc ccatgagcct ggcctactgg ttcatggacg acggaggcaa gtgggactac 480 

aacaagaaca gcaccaacaa gtcaattgtg ctgaacaccc aaagcttcac cttcgaagaa 540 

gtggagtacc tcgtcaaggg cctgcgcaac aagttccagc tgaactgcta cgtgaagatc 600 

aacaagaaca agcctatcat ctacatcgac agcatgagct acctgatctt ctacaacctg 660 

atcaagccat acctgatccc tcagatgatg tacaagctgc ccaacaccat cagcagcgag 720 

accttcctga ag 732 
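
As a quick consistency check on a listing row, the sequence can be stripped of its spacing and position counter and translated. The Python sketch below is illustrative only and not part of the sequence listing: it assumes the standard nuclear genetic code, the helper names `clean` and `translate` are hypothetical, and only the first row of SEQ ID No 4 is used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean(rows):
    """Join listing rows, dropping spaces and trailing position counters."""
    return "".join(ch for row in rows for ch in row.lower() if ch in "acgt")


def translate(dna):
    """Translate complete codons of an unambiguous DNA sequence."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_row = "atggccaagc ctcccaagaa gaagcgcaaa gtgaacatca agaagaacca ggtgatgaac 60"
seq = clean([first_row])
print(len(seq))        # number of nucleotides recovered from the row
print(translate(seq))  # encoded amino acids, one-letter code
```

Comparing the recovered length with the position counter at the end of each row is a simple way to spot rows in which characters were lost or duplicated during extraction.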



<210> 5 
<211> 3262
<212> DNA 

<213> Artificial sequence 



<220> 

<223> T-DNA of pTTAM78 (target locus) 
<220> 

<221> misc_feature 

<222> (1) . . (25)

<223> right T-DNA border sequence 



<220> 

<221> misc_feature 

<222> (26) . . (72)

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature 

<222> (333) . . (73) 

<223> 3' nos



<220> 

<221> misc_feature 

<222> (334) . . (351) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (352) . . (903) 

<223> bar sequence (complement) 



<220> 

<221> misc_feature 

<222> (904) . . (928) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (929) . . (946)

<223> I-SceI recognition site



<220> 

<221> misc_feature

<222> (947) . . (967)

<223> synthetic polylinker sequence



<220> 

<221> misc_feature 

<222> (968) . . (1171) 

<223> 3' g7



<220> 

<221> misc_feature

<222> (1172) . . (1290) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (1291) . . (1577)

<223> promoter nopaline synthetase gene 



<220> 

<221> misc_feature 

<222> (1578) . . (1590) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (1591) . . (2394)

<223> nptII

<220>

<221> misc_feature 

<222> (2395) . . (2567) 

<223> 3' neo 



<220> 

<221> misc_feature 

<222> (2568) . . (3183)

<223> 3' ocs



<220> 

<221> misc_feature 

<222> (3184) . . (3234)

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature 

<222> (3235) . . (3262)

<223> left T-DNA border sequence 



<400> 5 

aattacaacg gtatatatcc tgccagtact cggccgtcga cctgcaggca attggtacct 60

agaggatctt cccgatctag taacatagat gacaccgcgc gcgataattt atcctagttt 120

gcgcgctata ttttgttttc tatcgcgtat taaatgtata attgcgggac tctaatcata 180

aaaacccatc tcataaataa cgtcatgcat tacatgttaa ttattacatg cttaacgtaa 240

ttcaacagaa attatatgat aatcatcgca agaccggcaa caggattcaa tcttaagaaa 300 

ctttattgcc aaatgtttga acgatctgct tcggatccta gacgcgtgag atcagatctc 360

ggtgacgggc aggaccggac ggggcggtac cggcaggctg aagtccagct gccagaaacc 420

cacgtcatgc cagttcccgt gcttgaagcc ggccgcccgc agcatgccgc ggggggcata 480 

tccgagcgcc tcgtgcatgc gcacgctcgg gtcgttgggc agcccgatga cagcgaccac 540

gctcttgaag ccctgtgcct ccagggactt cagcaggtgg gtgtagagcg tggagcccag 600 

tcccgtccgc tggtggcggg gggagacgta cacggtcgac tcggccgtcc agtcgtaggc 660 

gttgcgtgcc ttccaggggc ccgcgtaggc gatgccggcg acctcgccgt ccacctcggc 720 

gacgagccag ggatagcgct cccgcagacg gacgaggtcg tccgtccact cctgcggttc 780 

ctgcggctcg gtacggaagt tgaccgtgct tgtctcgatg tagtggttga cgatggtgca 840

gaccgccggc atgtccgcct cggtggcacg gcggatgtcg gccgggcgtc gttctgggtc 900 

catggttata gagagagaga tagatttaat taccctgtta tccctaggcc gctgtacagg 960 

gcccgggatc ttgaaagaaa tatagtttaa atatttattg ataaaataac aagtcaggta 1020

ttatagtcca agcaaaaaca taaatttatt gatgcaagtt taaattcaga aatatttcaa 1080 

taactgatta tatcagctgg tacattgccg tagatgaaag actgagtgcg atattatgtg 1140 

taatacataa attgatgata tagctagctt aggcgcgcca tagatcccgt caattctcac 1200

tcattaggca ccccaggctt tacactttat gcttccggct cgtataatgt gtggaattgt 1260

gagcggataa caatttcaca caggaaacag gatcatgagc ggagaattaa gggagtcacg 1320 

ttatgacccc cgccgatgac gcgggacaag ccgttttacg tttggaactg acagaaccgc 1380

aacgattgaa ggagccactc agccgcgggt ttctggagtt taatgagcta agcacatacg 1440

tcagaaacca ttattgcgcg ttcaaaagtc gcctaaggtc actatcagct agcaaatatc 1500 

tcttgtcaaa aatgctccac tgacgttcca taaattcccc tcggtatcca attagagtct 1560 

catattcact ctcaatcaaa gatccggccc atgatcatgt ggattgaaca agatggattg 1620 

cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 1680 
acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 1740

tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc agcgcggcta 1800

tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 1860

ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 1920

gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 1980 

ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2040 

atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2100

gccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgaggatct cgtcgtgacc 2160 

catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 2220 

gactgtggcc ggctgggtgt ggcggaccga tatcaggaca tagcgttggc tacccgtgat 2280

attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 2340 

gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgagcggga 2400 

ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga gatttcgatt 2460 

ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac gccggctgga 2520 

tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccctg ctttaatgag 2580

atatgcgaga cgcctatgat cgcatgatat ttgctttcaa ttctgttgtg cacgttgtaa 2640 

aaaacctgag catgtgtagc tcagatcctt accgccggtt tcggttcatt ctaatgaata 2700

tatcacccgt tactatcgta tttttatgaa taatattctc cgttcaattt actgattgta 2760 

ccctactact tatatgtaca atattaaaat gaaaacaata tattgtgctg aataggttta 2820

tagcgacatc tatgatagag cgccacaata acaaacaatt gcgttttatt attacaaatc 2880 

caattttaaa aaaagcggca gaaccggtca aacctaaaag actgattaca taaatcttat 2940

tcaaatttca aaaggcccca ggggctagta tctacgacac accgagcggc gaactaataa 3000

cgttcactga agggaactcc ggttccccgc cggcgcgcat gggtgagatt ccttgaagtt 3060

gagtattggc cgtccgctct accgaaagtt acgggcacca ttcaacccgg tccagcacgg 3120

cggccgggta accgacttgc tgccccgaga attatgcagc atttttttgg tgtatgtggg 3180

ccctgtacag cggccgcgtt aacgcgtata ctctagagcg atcgccatgg agccatttac 3240 

aattgaatat atcctgccgc cg 3262



<210> 6 

<211> 5345 

<212> DNA

<213> Artificial sequence 



<220> 

<223> the T-DNA of pTTAS 2 (repair DNA) 
<220> 

<221> misc_feature

<222> (1) . . (25) 

<223> right T-DNA border sequence 



<220> 

<221> misc_feature

<222> (26) . . (62)

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (63) . . (578)

<223> bar (3' deleted) (complement)

<220>

<221> misc_feature

<222> (579) . . (603) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (604) . . (616)

<223> partial I-SceI site



<220> 

<221> misc_feature

<222> (1429) . . (617)

<223> P35S3 (complement) 



<220> 

<221> misc_feature 

<222> (1430) . . (1438) 

<223> partial I-SceI site



<220> 

<221> misc_feature 

<222> (1460) . . (1663)

<223> 3' gene 7



<220> 

<221> misc_feature 

<222> (1664) . . (1782)

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature 

<222> (1783) . . (2069) 

<223> promoter of the nopaline synthetase gene



<220> 

<221> misc_feature

<222> (2070) . . (2082)

<223> synthetic polylinker sequence

<220>

<221> misc_feature

<222> (2083) . . (2886)

<223> nptII



<220> 

<221> misc_feature

<222> (2887) . . (3059)

<223> 3' neo



<220> 

<221> misc_feature

<222> (3060) . . (3675)

<223> 3' ocs



<220> 

<221> misc_feature

<222> (3676) . . (3731) 

<223> synthetic polylinker sequence 



<220> 

<221> misc_feature

<222> (3732) . . (4246)

<223> p35SS2 



<220> 

<221> misc_feature

<222> (4247) . . (4289) 

<223> AtSlBL 



<220> 

<221> misc_feature

<222> (4290) . . (4322) 

<223> NLS 



<220> 

<221> misc_feature 

<222> (4323) . . (5023) 

<223> I-SceI defective

<220>

<221> misc_feature

<222> (5024) . . (5260)

<223> 3' 35S



<220> 

<221> misc_feature

<222> (5261) . . (5317)

<223> synthetic polylinker sequence



<220> 

<221> misc_feature

<222> (5318) . . (5345)

<223> left T-DNA border sequence 



<400> 6

aattacaacg gtatatatcc tgccagtact cggccgtcga cctgcaggca attggtacga 60

tcctagacgc gtgagatcag atcctgccag aaacccacgt catgccagtt cccgtgcttg 120

aagccggccg cccgcagcat gccgcggggg gcatatccga gcgcctcgtg catgcgcacg 180 

ctcgggtcgt tgggcagccc gatgacagcg accacgctct tgaagccctg tgcctccagg 240

gacttcagca ggtgggtgta gagcgtggag cccagtcccg tccgctggtg gcggggggag 300

acgtacacgg tcgactcggc cgtccagtcg taggcgttgc gtgccttcca ggggcccgcg 360 

taggcgatgc cggcgacctc gccgtccacc tcggcgacga gccagggata gcgctcccgc 420 

agacggacga ggtcgtccgt ccactcctgc ggttcctgcg gctcggtacg gaagttgacc 480 

gtgcttgtct cgatgtagtg gttgacgatg gtgcagaccg ccggcatgtc cgcctcggtg 540 

gcacggcgga tgtcggccgg gcgtcgttct gggtccatgg ttatagagag agagatagat 600

ttaattaccc tgttattaga gagagactgg tgatttcagc gtgtcctctc caaatgaaat 660 

gaacttcctt atatagagga agggtcttgc gaaggatagt gggattgtgc gtcatccctt 720

acgtcagtgg agatgtcaca tcaatccact tgctttgaag acgtggttgg aacgtcttct 780 

ttttccacga tgctcctcgt gggtgggggt ccatctttgg gaccactgtc ggcagaggca 840 

tcttgaatga tagcctttcc tttatcgcaa tgatggcatt tgtaggagcc accttccttt 900 

tctactgtcc tttcgatgaa gtgacagata gctgggcaat ggaatccgag gaggtttccc 960 

gaaattatcc tttgttgaaa agtctcaata gccctttggt cttctgagac tgtatctttg 1020 

acatttttgg agtagaccag agtgtcgtgc tccaccatgt tgacgaagat tttcttcttg 1080 

tcattgagtc gtaaaagact ctgtatgaac tgttcgccag tcttcacggc gagttctgtt 1140

agatcctcga tttgaatctt agactccatg catggcctta gattcagtag gaactacctt 1200 

tttagagact ccaatctcta ttacttgcct tggtttatga agcaagcctt gaatcgtcca 1260

tactggaata gtacttctga tcttgagaaa tatgtctttc tctgtgttct tgatgcaatt 1320

agtcctgaat cttttgactg catctttaac cttcttggga aggtatttga tctcctggag 1380

attgttactc gggtagatcg tcttgatgag acctgctgcg taggaacgct tatccctagg 1440

ccgctgtaca gggcccggga tcttgaaaga aatatagttt aaatatttat tgataaaata 1500 

acaagtcagg tattatagtc caagcaaaaa cataaattta ttgatgcaag tttaaattca 1560 

gaaatatttc aataactgat tatatcagct ggtacattgc cgtagatgaa agactgagtg 1620 

cgatattatg tgtaatacat aaattgatga tatagctagc ttaggcgcgc catagatccc 1680

gtcaattctc actcattagg caccccaggc tttacacttt atgcttccgg ctcgtataat 1740

gtgtggaatt gtgagcggat aacaatttca cacaggaaac aggatcatga gcggagaatt 1800 

aagggagtca cgttatgacc cccgccgatg acgcgggaca agccgtttta cgtttggaac 1860 

tgacagaacc gcaacgattg aaggagccac tcagccgcgg gtttctggag tttaatgagc 1920

taagcacata cgtcagaaac cattattgcg cgttcaaaag tcgcctaagg tcactatcag 1980 

ctagcaaata tttcttgtca aaaatgctcc actgacgttc cataaattcc cctcggtatc 2040 

caattagagt ctcatattca ctctcaatca aagatccggc ccatgatcat gtggattgaa 2100 

caagatggat tgcacgcagg ttctccggcc gcttgggtgg agaggctatt cggctatgac 2160 

tgggcacaac agacaatcgg ctgctctgat gccgccgtgt tccggctgtc agcgcagggg 2220 
cgcccggttc tttttgtcaa gaccgacctg tccggtgccc tgaatgaact gcaggacgag 2280

gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt 2340

gtcactgaag cgggaaggga ctggctgcta ttgggcgaag tgccggggca ggatctcctg 2400

tcatctcacc ttgctcctgc cgagaaagta tccatcatgg ctgatgcaat gcggcggctg 2460

catacgcttg atccggctac ctgcccattc gaccaccaag cgaaacatcg catcgagcga 2520

gcacgtactc ggatggaagc cggtcttgtc gatcaggatg atctggacga agagcatcag 2580

gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc gcatgcccga cggcgaggat 2640

ctcgtcgtga cccatggcga tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt 2700

tctggattca tcgactgtgg ccggctgggt gtggcggacc gctatcagga catagcgttg 2760

gctacccgtg atattgctga agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt 2820

tacggtatcg ccgctcccga ttcgcagcgc atcgccttct atcgccttct tgacgagttc 2880

ttctgagcgg gactctgggg ttcgaaatga ccgaccaagc gacgcccaac ctgccatcac 2940

gagatttcga ttccaccgcc gccttctatg aaaggttggg cttcggaatc gttttccggg 3000

acgccggctg gatgatcctc cagcgcgggg atctcatgct ggagttcttc gcccaccccc 3060

tgctttaatg agatatgcga gacgcctatg atcgcatgat atttgctttc aattctgttg 3120

tgcacgttgt aaaaaacctg agcatgtgta gctcagatcc ttaccgccgg tttcggttca 3180

ttctaatgaa tatatcaccc gttactatcg tatttttatg aataatattc tccgttcaat 3240

ttactgattg taccctacta cttatatgta caatattaaa atgaaaacaa tatattgtgc 3300

tgaataggtt tatagcgaca tctatgatag agcgccacaa taacaaacaa ttgcgtttta 3360

ttattacaaa tccaatttta aaaaaagcgg cagaaccggt caaacctaaa agactgatta 3420

cataaatctt attcaaattt caaaaggccc caggggctag tatctacgac acaccgagcg 3480

gcgaactaat aacgttcact gaagggaact ccggttcccc gccggcgcgc atgggtgaga 3540

ttccttgaag ttgagtattg gccgtccgct ctaccgaaag ttacgggcac cattcaaccc 3600

ggtccagcac ggcggccggg taaccgactt gctgccccga gaattatgca gcattttttt 3660

ggtgtatgtg ggccctgtac agcggccgcg ttaacgcgta tactctagta tgcaccatac 3720

atggagtcaa aaattcagat cgaggatcta acagaactcg ccgtgaagac tggcgaacag 3780

ttcatacaga gtcttttacg actcaatgac aagaagaaaa tcttcgtcaa catggtggag 3840

cacgacactc tcgtctactc caagaatatc aaagatacag tctcagaaga ccaaagggct 3900

attgagactt ttcaacaaag ggtaatatcg ggaaacctcc tcggattcca ttgcccagct 3960

atctgtcact tcatcaaaag gacagtagaa aaggaaggtg gcacctacaa atgccatcat 4020

tgcgataaag gaaaggctat cgttcaagat gcctctgccg acagtggtcc caaagatgga 4080

cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 4140

gtggattgat gtgatatctc cactgacgta agggatgacg cacaatccca ctatccttcg 4200

caagaccctt cctctatata aggaagttca tttcatttgg agaggactcg agaattaagc 4260

aaaagaagaa gaagaagaag tccaaaacca tggctaaacc ccccaagaag aagcgcaagg 4320

ttaacatcaa aaaaaaccag gtaatgaacc tgggtccgaa ctctaaactg ctgaaagaat 4380

acaaatccca gctgatcgaa ctgaacatcg aacagttcga agcaggtatc ggtctgatcc 4440

tgggtgatgc ttacatccgt tctcgtgatg aaggtaaaac ctactgtatg cagttcgagt 4500

ggaaaaacaa agcatacatg gaccacgtat gtctgctgta cgatcagtgg gtactgtccc 4560

cgccgcacaa aaaagaacgt gttaaccacc tgggtaacct ggtaatcacc tggggcgccc 4620

agactttcaa acaccaagct ttcaacaaac tggctaacct gttcatcgtt aacaacaaaa 4680

aaaccatccc gaacaacctg gttgaaaact acctgacccc gatgtctctg gcatactggt 4740

tcatggatga tggtggtaaa tgggattaca acaaaaactc taccaacaaa gtattgtact 4800

gaacacccag tctttcactt tcgaagaagt agaatacctg gttaagggtc tgcgtaacaa 4860

attccaactg aactgttacg taaaaatcaa caaaaacaaa ccgatcatct acatcgattc 4920

tatgtcttac ctgatcttct acaacctgat caaaccgtac ctgatcccgc agatgatgta 4980

caaactgccg aacactatct cctccgaaac tttcctgaaa tagggctagc aagcttggac 5040

acgctgaaat caccagtctc tctctacaaa tctatctctc tctattttct ccataataat 5100

gtgtgagtag ttcccagata agggaattag ggttcctata gggtttcgct catgtgttga 5160

gcatataaga aacccttagt atgtatttgt atttgtaaaa tacttctatc aataaaattt 5220

ctaattccta aaaccaaaat ccagtactaa aatccagatc atgcatggta cagcggccgc 5280

gttaacgcgt atactctaga gcgatcgcca tggagccatt tacaattgaa tatatcctgc 5340

cgccg 5345
<210> 7

<211> 4066

<212> DNA

<213> Artificial sequence 



<220> 

<223> pCV78 
<220> 

<221> misc_feature

<222> (234) . . (763) 

<223> P35S2 promoter 

<220> 

<221> misc_feature 

<222> (764) . . (805)

<223> Atslb 1



<220> 

<221> misc_feature

<222> (808) . . (839)

<223> nuclear localization signal 



<220> 

<221> misc_feature

<222> (840) . . (1541)

<223> I-SceI synthetic



<220> 

<221> misc_feature

<222> (1544) . . (1792)

<223> 3' 35S 



<220> 

<221> misc_feature 

<222> (3866) . . (3006)

<223> Ampicillin resistance (complement) 



;^Lattt cqqtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 
tcgcgcgttt cggtg^ gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 
cagcttgtct ^aagcggat gc cggg g w ^ Hcagattgta ctgagagtgc 

SSSSS 2S5SS SacScg? aScatggcg cgccatatgc accatacatg 
tlltltl ttcagatcga ggatctaaca gaactcgccg tgaagactgg cgaacagttc 
gagtcaaaaa ttcagatcga 99 aagaaaatct tcgtcaacat ggtggagcac 

SSSSS SSSSS Satftcaaa gaLcagtct cagaagacca aagggctatt 
gagacttttc aacaaagggt aatatcggga aacctcctcg gattccattg cccagctatc 480

tgtcacttca tcaaaaggac agtagaaaag gaaggtggca cctacaaatg ccatcattgc 540

gataaaggaa aggctatcgt tcaagatgcc tctgccgaca gtggtcccaa agatggaccc 600

ccacccacga ggagcatcgt ggaaaaagaa gacgttccaa ccacgtcttc aaagcaagtg 660

gattgatgtg atatctccac tgacgtaagg gatgacgcac aatcccacta tccttcgcaa 720

gacccttcct ctatataagg aagttcattt catttggaga ggactcgaga attaagcaaa 780 

agaagaagaa gaagaagtcc aaaaccatgg ccaagcctcc caagaagaag cgcaaagtga 840 

acatcaagaa gaaccaggtg atgaacctgg gacctaacag caagctcctg aaggagtaca 900 

agagccagct gatcgaactg aacatcgagc agttcgaagc tggcatcggc ctgatcatgg 960 

gcgatgccta catcagatcc cgggacgaag gcaagaccta ctgcatgcag ttcgagtgga 1020

agaacaaggc ctacatggac cacgtgtgtc tgctgtacga ccagtgggtc ctgagccatc 1080 

ctcacaagaa ggagcgcgtg aaccatctgg gcaacctcgt gatcacctgg ggagcccaga 1140 

ccttcaagca ccaggccttc aacaagctgg ccaacctgtt catcgtgaac aacaagaaga 1200

ccatccccaa caacctcgtg gagaactacc tcactcccat gagcctggcc tactggttca 1260

tggacgacgg aggcaagtgg gactacaaca agaacagcac caacaagtca attgtgctga 1320 

acacccaaag cttcaccttc gaagaagtgg agtacctcgt caagggcctg cgcaacaagt 1380

tccagctgaa ctgctacgtg aagatcaaca agaacaagcc tatcatctac atcgacagca 1440 

tgagctacct gatcttctac aacctgatca agccatacct gatccctcag atgatgtaca 1500 

agctgcccaa caccatcagc agcgagacct tcctgaagtg aggctagcaa gcttggacac 1560

gctgaaatca ccagtctctc tctacaaatc tatctctctc tattttctcc ataataatgt 1620

gtgagtagtt cccagataag ggaattaggg ttcctatagg gtttcgctca tgtgttgagc 1680 

atataagaaa cccttagtat gtatttgtat ttgtaaaata cttctatcaa taaaatttct 1740

aattcctaaa accaaaatcc agtactaaaa tccagatcat gcatggtaca gcggccgcgt 1800 

taacgcgtat actctagagc gatcgcaagc ttggcgtaat catggtcata gctgtttcct 1860

gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 1920 

aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 1980

gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 2040 

agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 2100 

gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 2160

gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 2220 

cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 2280 

aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 2340 

tttacccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 2400

ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc aaagctcacg ctgtaggtat 2460 

ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 2520 

cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 2580

ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 2640 

gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac agtatttggt 2700

atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 2760 

aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 2820 

aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 2880 

gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 2940 

cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 3000

gacagttacc aatgcttaat cagtgaggca gctatctcag cgatctgtct atttcgttca 3060

tccatagttg cctgactccc cgtcgtgtag ataactacga taagggaggg cttaccatct 3120 

ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 3180 

ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 3240

atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 3300

cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 3360

tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 3420 

aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 3480

tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 3540 

ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 3600 

agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 3660

gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 3720

agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 3780

accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 3840 




gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 3900

cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 3960 

ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc 4020 

atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtc 4066 
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